Baculoviruses are pathogenic to insects. Presently, their
origin and evolutionary paths are not clearly understood.
Using a baculovirus structural protein gene, gp41, that has
been shown to be highly conserved among baculoviruses, the
gene transcription, protein structure, genomic structure and
phylogenetic relationships were studied.

Two complete gp41 nucleotide sequences from Spodoptera
frugiperda multiple nucleocapsid nucleopolyhedrovirus
(SfMNPV-2) and Anticarsia gemmatalis MNPV (AgMNPV-2D), and a
partial gp41 gene sequence from Lymantria dispar MNPV
(LdMNPV), were determined.

Northern  blot  analysis  showed  that  the  SfMNPV-2  gp41 
was  a  late  gene  expressed  12  hours  post-infection.     The  gp41 
promoter  region  contained  three  transcriptional  start  sites, 
two  within  a  consensus  transcriptional  start  site    (TAAG)  of 
baculovirus  late  genes,   and  the  other  located  in  a  region 
where  no  consensus  motif  has  been  determined. 

The comparison of nucleotide and amino acid sequences
of the AgMNPV-2D with four other NPVs, Autographa
californica MNPV (AcMNPV), Bombyx mori MNPV (BmMNPV), SfMNPV
and Helicoverpa zea single nucleocapsid nucleopolyhedrovirus
(HzSNPV), showed a minimum of 59% nucleotide identity and
70% amino acid similarity. Analysis of the hydrophobicity
and protein secondary structure of gp41 revealed several
conserved domains, including eight α-helix, four loop, one
β-sheet and one transmembrane domain.

The analysis of the gp41 upstream and downstream
regions from those five NPVs showed that they contained the
vlf-1 gene, ORF 330, ORF 300, gp41 and ORF >667 positioned
from right to left and with a similar arrangement in their
genomic maps. Among these ORFs, the AgMNPV-2D shared 50 to
70% nucleotide identity and 60 to 90% amino acid similarity
with the four other NPVs.

Six baculovirus genes, including polyhedrin (polh), p10,
gp41, gp64, DNA polymerase (dnapol) and ecdysteroid
UDP-glucosyltransferase (egt), were used to reconstruct
phylogenetic trees. The results confirmed that hymenopteran
NPVs diverged earlier from lepidopteran granuloviruses (GVs)
and lepidopteran NPVs, and that lepidopteran GVs later
diverged from lepidopteran NPVs. The dnapol phylogenetic tree
also showed that the baculoviruses had an independent
evolutionary path from two other insect DNA viruses,
Spodoptera ascovirus (SAV) and Choristoneura fumiferana
entomopoxvirus (CbEPV).

CHAPTER 1
INTRODUCTION TO BACULOVIRUSES

Review 

Scientific literature on the study of baculoviruses
goes back to the beginning of the nineteenth century, and
now includes thousands of scientific articles that have
contributed to the understanding of this class of viruses.
Some papers cover fundamental studies such as those
involving the baculovirus infection processes (Volkman &
Keddie, 1990; Granados & Williams, 1986), the baculovirus
structural proteins (Summers & Smith, 1978; Maruniak, 1979,
1986; Rohrmann, 1992), the baculovirus DNA genome (Ayres et
al., 1994), and the regulation of gene expression (Friesen &
Miller, 1986; Blissard & Rohrmann, 1990). Studies dealing
with the application of baculoviruses in agriculture and
biotechnology, such as the use of baculoviruses as biological
control agents (Huber, 1986; Bonning & Hammock, 1992;
Moscardi & Sosa-Gomez, 1993) and the baculovirus expression
system (Summers & Smith, 1987; King & Possee, 1992; O'Reilly
et al., 1992; Richardson, 1995; Shuler et al., 1995), have
also been reported.

Fundamental  Studies  on  Baculoviruses 

The fundamental characteristics of baculoviruses have
been described in several review papers and books (Granados
& Federici, 1986; Blissard & Rohrmann, 1990; Tanada & Kaya,
1993; Miller, 1996). These reviews include the study of
viral particles, nucleocapsids, enveloped virions,
infectious elements, the viral infection pathway,
cytopathology, viral replication, host specificity, viral
gene regulation, and viral DNA replication. In this
section, the viral infection process, structural proteins,
DNA genome and regulation of gene expression of
baculoviruses will be briefly discussed.

Baculovirus  infection 

Baculoviruses have an enveloped rod-shaped virion
(Federici, 1986). The virions are generally 40-50 nm in
diameter and 200-400 nm in length (Bilimoria, 1986). The
baculoviruses are divided into two genera based upon the
morphology of the inclusion bodies (IBs) (Murphy et al.,
1995). Virions of the genus Nucleopolyhedrovirus (NPV) are
occluded in a proteinaceous matrix, the polyhedron. The
polyhedron ranges from 0.5 to 15 µm, and there are usually
several virions embedded in each polyhedron (Federici,
1986). Two subtypes of NPVs have been found: the single
nucleocapsid NPV (SNPV) contains only one nucleocapsid per
envelope, and the multiple nucleocapsid NPV (MNPV) contains
several nucleocapsids (1-17) per envelope (Bilimoria, 1986).
The second genus, Granulovirus (GV), contains only one
virion occluded in an oval shaped proteinaceous matrix, and
ranges in size from 160 to 300 nm in width by 300 to 500 nm
in length (Federici, 1986). The virion of GVs usually
consists of one nucleocapsid per envelope, but in a few
cases has been found to have more than one (Murphy et al.,
1995).

Two different types of virions are produced during the
replication cycle of baculovirus. One is the occluded
virion (OV) that is only found inside the polyhedron
(Volkman, 1986), and the other is the budded virion (BV)
that functions in cell to cell infection (Granados & Lawler,
1981). The OV is occluded in either a polyhedron (for NPV)
or granule (for GV). The polyhedron protects the virion
from environmental decay. Upon ingestion by insect larvae,
the polyhedra are dissolved in the midgut's alkaline juices
(Pritchett et al., 1982). The liberated OVs then penetrate
the peritrophic membrane and infect the columnar epithelial
cells (Tanada et al., 1975). This step marks the end of the
primary infection. The budded virions that are produced in
the infected nucleus of columnar cells then cause a
secondary infection (Granados & Williams, 1986). The BVs go
through the hemocoel to infect other cells such as those of
the tracheal and the connective tissues (Adams et al., 1977;
Keddie et al., 1989; Volkman & Keddie, 1990). Late in the
infection, occluded virions are formed in the nuclei of
infected cells. The progeny virions (BVs) are found as
early as sixteen hours after initiation of the infection
(Granados & Lawler, 1981). Polyhedra are found starting at
24 hours post-infection (P.I.) (Granados & Lawler, 1981) and
are released upon cell death.

Baculovirus structural proteins

Although BVs and OVs have identical DNA genomes (Smith
& Summers, 1978), the surrounding membrane and proteins are
very different (Summers & Volkman, 1976). The OV membrane
is formed in the nuclei by de novo synthesis (Stoltz et al.,
1973), while the BV membrane is constructed from the
cytoplasmic membrane (Tanada & Hess, 1976; Adams et al.,
1977). The differences between OV and BV membrane
composition in Autographa californica MNPV (AcMNPV) have
been studied (Braunagel & Summers, 1994). The protein and
the lipid compositions were both compared, and it was
observed that the major BV phospholipid is
phosphatidylserine, while the major OV lipids are
phosphatidylcholine and phosphatidylethanolamine. The
results also indicated that the nuclear membrane of the
infected Spodoptera frugiperda cell line (Sf9) has a lipid
composition different from those of the OVs and BVs.

The protein compositions of OVs and BVs were analyzed,
and the dominant phosphoproteins differed between the two
virions. The OVs have a 36 kDa major phosphoprotein, while
the BVs have an 85 kDa major phosphoprotein. Glycoprotein
analysis showed that more glycoproteins were present in BV
than OV. The BV specific glycoproteins are 136, 128, 89, 45
and 40 kDa, and the OV specific glycoproteins are 70, 53,
49, 42 and 40 kDa. Moreover, several specific OV structural
proteins were identified. These proteins include
ODV-E18, ODV-E35, ODV-E27, ODV-E56 and ODV-E66 (Maruniak &
Summers, 1981; Hong et al., 1994; Braunagel et al., 1996a,
1996b; Theilmann et al., 1996). These OV specific
proteins, such as ODV-E56 and ODV-E66, may be involved in
the production of intranuclear membrane and protein
transport and insertion into the viral envelope membrane
(Braunagel et al., 1996a; 1996b).

The gp41 gene also has been shown to code for an OV
specific protein (Whitford & Faulkner, 1992a). Gp41 genes
are highly conserved with 60% nucleotide sequence homology
among four different baculoviruses (Liu & Maruniak, 1995).
The gp41 protein was identified as an O-linked glycoprotein,
and its localization was predicted to be in the tegument
(Whitford & Faulkner, 1992a). Although the biological
function of gp41 protein has not yet been defined, it may
have functions similar to those of other OV specific
proteins, such as formation of the envelope membrane and/or
protein transport into the membrane. Another OV specific
protein, p74, has been shown to be essential for virulence
of baculoviruses. Polyhedra produced by the AcMNPV virus
with mutations in the p74 gene failed to kill Trichoplusia
ni larvae per os (Kuzio et al., 1989). This indicated that
p74 is required for viral infectivity. However, details of
the mechanism of p74 protein function still need to be
elucidated.

In contrast to the OV specific proteins, the gp64
protein is specifically found in BV (Blissard & Rohrmann,
1989; Whitford et al., 1989) and plays an important role in
cell to cell infection (Volkman & Goldsmith, 1984). The
gp64 protein is concentrated at one end of the virion
membrane and may be involved in a pH dependent fusion with
the host cell endosomal membrane (Volkman & Goldsmith,
1985). Furthermore, gp64 has been shown to be a type I
integral membrane protein with one membrane fusion domain
and one oligomerization domain (Monsma & Blissard, 1995;
Monsma et al., 1996). Gp64 is highly glycosylated, and
glycosylation is required for the incorporation of gp64 into
the virion envelope (Rohrmann, 1992). In addition, a signal
peptide sequence was found at the N-terminus of gp64 that
was missing in the mature form of the protein (Rohrmann,
1992).

Besides the OV and BV structural proteins, there are
three other major structural proteins found in
baculoviruses: polyhedrin, PE, and p10 proteins. Polyhedrin
is the basic subunit of polyhedra and is reported to be a
29 kDa protein with highly conserved amino acid sequences
between NPVs and GVs (Akiyoshi et al., 1985; Maruniak, 1986;
Blissard & Rohrmann, 1990). It has 80% identity among
lepidopteran NPVs, 50% identity between the lepidopteran
NPVs and GVs, and 40% identity between the lepidopteran and
hymenopteran NPVs (Rohrmann, 1992). The carboxyl terminal
and central regions of polyhedrin genes are highly conserved,
but the N-terminal region is less conserved (Akiyoshi et al.,
1985; Chakerian et al., 1985; Rohrmann, 1986). The
cytoplasmic polyhedrosis virus (CPV) also produces polyhedrin
protein to form a type of polyhedra. However, the polyhedrin
amino acid compositions of NPVs and CPVs differ
quantitatively and qualitatively (Maruniak, 1986; Rohrmann,
1986).

An electron-dense envelope named polyhedron membrane or
polyhedron calyx surrounds the polyhedra (Rohrmann, 1992).
PE (polyhedron electron-dense envelope) protein has been
suggested to be a major component of the PE, and is
phosphorylated and thiolly linked to the carbohydrate
component of the polyhedron envelope (Minion et al., 1979;
Whitt & Manning, 1988; Rohrmann, 1992). The PE gene is a
late gene, expressed at 48 hours post infection (Russell &
Rohrmann, 1990). The PE nucleotide homology among AcMNPV,
OpMNPV and LdMNPV is 58, 27 and 34%, respectively (Rohrmann,
1992). Thus, the PE protein is not highly conserved among
the different baculoviruses.

The p10 protein has been shown to be essential for
polyhedra formation. Three functional domains of p10
proteins were identified in AcMNPV using site-directed
mutation analysis (van Oers et al., 1993). These functional
domains include a fibrillar structure formation domain (15
amino acids from the carboxyl terminus), a nuclear
disintegration domain (amino acid residues 52-79), and an
intermolecular binding domain (the amino terminal half of
the p10 protein). The unsuccessful substitution of the
AcMNPV p10 gene with the Spodoptera exigua MNPV (SeMNPV) p10
gene indicated that at least one virus-specific factor was
required to interact with the p10 protein for nuclear
disintegration (van Oers et al., 1994). In general, the
homology of p10 genes among baculoviruses is very low; there
is only 42, 26 and 38% amino acid sequence identity among
AcMNPV, SeMNPV and OpMNPV, respectively (Rohrmann, 1992).

Baculovirus  DNA  genome 

Baculoviruses are double stranded DNA viruses with the
genome size ranging from 88 to 160 kilobase pairs (kb)
(Burgess, 1977; Blissard & Rohrmann, 1990). The genomic
structure among baculoviruses has been shown to be similar
(Leisy et al., 1984). The alignment of AcMNPV, Orgyia
pseudotsugata MNPV (OpMNPV), and SeMNPV genomes showed that
these baculoviruses have similar locations for the
polyhedrin gene, p10 gene and ecdysteroid
UDP-glucosyltransferase (egt) gene (van Strien et al., 1996).
On the other hand, the genomic location of the ubiquitin
gene is different among these baculoviruses, and this
difference is probably caused by gene rearrangement. Gene
rearrangement is also apparent for the gp41 genes of five
different NPVs (Chapter 3).

The genomic DNA sequences of AcMNPV (Ayres et al.,
1994) and BmMNPV (Maeda, unpublished data; GenBank accession
number, L33180) have been completed and provide valuable
information in analyzing the potential open reading frames
(ORFs). In AcMNPV, 154 potential ORFs (greater than 150
nucleotides in length) and the potential transcription
motifs of these ORFs have also been identified. A complete
genomic structural map has located all the identified genes
of AcMNPV (Ayres et al., 1994).
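
The length criterion mentioned above (potential ORFs greater
than 150 nucleotides in length) can be illustrated with a
minimal Python sketch. It is not the procedure used by Ayres
et al. (1994). The function names find_orfs and
reverse_complement, the definition of an ORF as the stretch
from the first ATG to the next in-frame stop codon, and the
toy sequence are all assumptions made only for illustration.

```python
# Hypothetical helpers for illustration; not taken from the cited work.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_len=150):
    """List ORFs longer than min_len nucleotides on both strands.

    An ORF is assumed to run from the first ATG in a frame to the
    next in-frame stop codon (stop included). Each hit is reported as
    (strand, start, end, length), with 0-based, end-exclusive
    coordinates on the strand that was scanned.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # open a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    length = i + 3 - start
                    if length > min_len:
                        orfs.append((strand, start, i + 3, length))
                    start = None  # look for the next ATG in this frame
    return orfs


# Toy sequence (made up): ATG, 60 alanine codons, then a stop codon.
toy = "ATG" + "GCA" * 60 + "TAA"
print(find_orfs(toy))
```

Raising min_len filters out short ORFs; a real annotation would
also have to deal with ambiguous bases and overlapping ORFs,
which this sketch ignores.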

Regulation  of  baculovirus  gene  expression 

The baculovirus genes are transcribed in an ordered
cascade. Four types of genes (immediate early, early, late,
and very late genes) have been described according to their
dependence on the transcription of previous types of genes
and on their occurrence before or after viral DNA
replication (Friesen & Miller, 1986; Guarino & Summers,
1986; Blissard & Rohrmann, 1990).

The immediate early (IE) genes, also called regulatory
genes, do not require any viral gene products for their
transcription and are involved in the transactivation of the
next gene expression phase (early genes) (Guarino & Summers,
1986; Chisholm & Henner, 1988). Examples of IE genes
include the IE-0, IE-1, IE-N, PE-28 and CG-30 genes (Carson
et al., 1988; Chisholm & Henner, 1988; Guarino & Summers,
1988). The second type of genes are called the early genes
and are involved in viral DNA replication. RNA polymerase
II is believed to be responsible for the transcription of
early genes (Grula et al., 1981; Fuchs et al., 1983). The
transcriptional motif, CAGT, is conserved in the promoters
of both immediate early and early genes (Blissard &
Rohrmann, 1989; Theilmann & Stewart, 1991; Ayres et al.,
1994).

In contrast to the IE and early genes, the late and
very late genes are transcribed after viral DNA replication,
and depend on the expression of the early genes (Miller,
1988; Thiem & Miller, 1989). RNA polymerase III is believed
to be responsible for the transcription of late and very
late genes (Blissard & Rohrmann, 1990; Zanotto et al.,
1992). By using a primer extension assay (Rohrmann, 1986;
Thiem & Miller, 1989), a common motif of late and very late
genes (TAAG) has been shown to be a transcription start
site (the first T or first A). Most of the late and very
late genes code for structural proteins needed for the
assembly of baculovirus virions and polyhedra (Miller, 1988;
Williams et al., 1989).
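
As an illustration of how the two promoter motifs discussed in
this section (CAGT for immediate early and early genes, TAAG
for late and very late genes) might be located in an upstream
sequence, the following Python sketch lists every position at
which a motif occurs. The function name find_motif and the
example sequence are hypothetical. A motif match only marks a
candidate site; the start sites themselves were mapped
experimentally by primer extension in the studies cited above.

```python
def find_motif(seq, motif):
    """Return the 0-based start positions of every occurrence of motif.

    Overlapping occurrences are reported as well.
    """
    seq, motif = seq.upper(), motif.upper()
    hits = []
    pos = seq.find(motif)
    while pos != -1:
        hits.append(pos)
        pos = seq.find(motif, pos + 1)
    return hits


# Made-up upstream sequence, used only to exercise the function.
upstream = "GGCAGTTTATAAGCATTTAAGATG"
print("CAGT at", find_motif(upstream, "CAGT"))
print("TAAG at", find_motif(upstream, "TAAG"))
```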

Application  of  Baculoviruses  in  Agriculture  and 
Biotechnology 

The baculoviruses are mainly used as microbial control
agents against insect pests (Huber, 1986). They have also
been developed as protein expression systems in
biotechnology (Summers & Smith, 1987; King & Possee, 1992;
O'Reilly et al., 1992; Richardson, 1995; Shuler et al.,
1995). Both applications represent the keystone for
studying baculoviruses, and contribute to the knowledge of
these viruses.

Use  of  baculoviruses  as  biological  control  agents 

Baculoviruses can infect a wide range of insects
including 34 families of Lepidoptera, a few families of
Hymenoptera, Diptera, Coleoptera, Neuroptera, Trichoptera,
Thysanura, and Siphonaptera (Tanada & Kaya, 1993; Murphy,
1995). More than 800 species of baculoviruses have been
reported from lepidopteran and dipteran hosts. They have
been used as microbial control agents for decades because of
their host specificity (Hawtin et al., 1992). At present,
several commercial baculovirus pesticides are registered
(Huber, 1986). These commercial baculovirus pesticides
include SeMNPV, HzSNPV, AcMNPV, Anagrapha falcifera MNPV
(AfMNPV), Cydia pomonella (codling moth) GV (Biosys Inc.),
LdMNPV, and NsSNPV (U.S. Forest Service, USDA). In Brazil
and the southern United States, AgMNPV has been used to
control the velvetbean caterpillar, Anticarsia gemmatalis,
in soybean crops (Moscardi & Sosa-Gomez, 1993; Funderburk et
al., 1992). In the northern regions of America, LdMNPV has
been successfully used to control the forest pest, gypsy
moth (Huber, 1986). There are, however, some limitations to
the use of baculoviruses, because the time required to kill
the hosts after baculovirus infection is often too long (5
to 10 days) to prevent crop losses. Therefore,
baculoviruses are only suitable for those crops presenting
certain levels of tolerance to insect damage (Bonning &
Hammock, 1992). The development of recombinant
baculoviruses with integrated toxin genes has the potential
to control pests more efficiently (Carbonell et al., 1988;
Bonning & Hammock, 1992). Some of the genetically improved
baculovirus insecticides have already been tested in the
field (Wood & Granados, 1991; Cory et al., 1994). The
results show that the modified baculoviruses kill insect
pests faster than wildtype baculoviruses, and therefore
could reduce crop damage (Maeda et al., 1991). Genetically
engineered baculoviruses will become useful to control
insect pests in forests and agricultural systems in the
future (Bonning & Hammock, 1992). However, the release of
recombinant baculoviruses to the natural environment is
still controversial (Fuxa, 1989).

Environmental safety is a main issue when baculoviruses
are applied as biological pesticides. Several species of
birds, aquatic organisms and mammals have been tested for
toxicological safety (Betz, 1986), and no deleterious effects
have yet been reported. Beneficial insects were also
tested, and no direct adverse effects were found (Groner,
1986). However, some parasite and predator species were
indirectly affected by baculoviruses due to the decrease in
host larvae resources (Betz, 1986).

The persistence of baculoviruses in the environment has
also been studied. Several environmental factors affect the
distribution and persistence of baculoviruses. These
factors include ultraviolet light (UV), rainfall,
temperature, pH of soil, and the microenvironment of the
plant surface (Bitton et al., 1987). Several techniques
have been used for detecting, tracing and identifying
baculoviruses in the field. These techniques include
microscopic diagnosis (Kaupp & Burke, 1984; Traverner &
Connor, 1992), bioassay, serological assays such as Enzyme
Linked Immunosorbent Assay (ELISA) (Naser & Miltenburger,
1982, 1983; Webb & Shelton, 1990), DNA dot blot
hybridization (Ward et al., 1987; Keating et al., 1989) and
polymerase chain reaction (PCR) (Burand et al., 1992; Moraes
& Maruniak, 1997). The latest development of a PCR
technique provides a convenient, fast and accurate way to
detect and identify baculoviruses in their natural
environment (Moraes & Maruniak, 1997).

Baculovirus  expression  system 


The baculovirus expression system was developed based
on the understanding of the baculovirus life cycle,
baculovirus gene regulation and baculovirus genome
structure. The original transfer vector was created
by using the polyhedrin gene region and the polyhedrin gene
promoter of AcMNPV to carry and express a foreign gene
(Smith et al., 1983). The constructed vector DNA is
delivered into insect cells that are infected with the
wildtype baculovirus to produce a recombinant virus. A
recombinant virus that carries the foreign gene is produced
due to the homologous DNA exchange between the polyhedrin
gene regions from the vector and the wildtype virus DNA.
This exchange interrupts polyhedrin gene transcription in
the recombinant virus, which then does not express the
polyhedrin protein. Therefore, the recombinant virus does
not form polyhedra. The recombinant virus is usually
selected by the expression of a marker gene such as that
coding for β-galactosidase, which digests the substrate
5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-gal)
to form blue plaques (Summers & Smith, 1987). Currently,
several baculovirus vectors as well as laboratory manuals
are available (Summers & Smith, 1987; King & Possee, 1992;
O'Reilly et al., 1992; Richardson, 1995; Shuler et al.,
1995). Sophisticated procedures for the expression of
foreign genes and subsequent protein purification have been
well established.

The benefits of using the baculovirus expression system
include high yields and protein posttranslational
modifications that are similar to those of other eukaryotic
systems, such as protein glycosylation, phosphorylation, and
amidation (Luckow & Summers, 1988a; Maeda, 1989). This
expression system can be used for pharmaceutical purposes,
insect physiology studies and pest control (Maeda, 1989).

Future  Study  and  Prospects 

Evolutionary  studies  of  baculoviruses 

In the 1960s and 1970s, the study of phylogenetic
relationships using a molecular approach showed tremendous
progress, mainly through the use of various techniques such
as protein electrophoresis, DNA-DNA hybridization,
immunological methods and protein sequencing. Statistical
measurements of genetic distances and methods for
reconstruction of phylogenetic trees have also been
developed (Li & Graur, 1991). The accumulation of DNA
sequence data has facilitated phylogenetic analysis.
Molecular evolutionary data could potentially be used to
interpret the relationships among baculoviruses and between
baculoviruses and other viruses. The evolution of DNA
viruses is usually caused by modifications of their genomes
due to DNA deletion, DNA recombination (gene rearrangement),
and DNA insertion from the host genome. Several baculovirus
genes show homology with host cell genes such as ubiquitin
(van Strien et al., 1996), and such data support the
evolutionary mechanism of incorporating host cell DNA into
the viral genome.
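
As a simple illustration of the genetic distance measurements
mentioned above, the following Python sketch computes the
proportion of differing sites (the p-distance) between two
aligned sequences and applies the Jukes-Cantor one-parameter
correction, d = -(3/4) ln(1 - 4p/3). The Jukes-Cantor model is
chosen here only because it is one of the simplest corrections;
it is not necessarily the measure used in the studies cited in
this chapter. The function names and the example sequences are
assumptions.

```python
import math


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap character ('-') in either sequence are
    skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor distance d = -(3/4) * ln(1 - 4p/3).

    The correction is undefined when p reaches 0.75.
    """
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor model")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Made-up aligned sequences differing at one of eight sites.
p = p_distance("ACGTACGT", "ACGTACGA")
print(p, jukes_cantor(p))
```

Percent identity, as reported elsewhere in this work, is simply
100 x (1 - p) over the compared sites; the corrected distance
grows faster than p because it allows for multiple
substitutions at the same site.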

The baculovirus polyhedrin gene has been used to
reconstruct a phylogenetic tree, showing the early
divergence of NPVs and GVs (Zanotto et al., 1993). The
results showed that the hymenopteran NPV diverged earlier
from the lepidopteran NPVs than from the lepidopteran GVs.
The data also suggested that the lepidopteran NPVs were
divided into two major branches. As of 1996, three
baculovirus genes had been used to reconstruct
phylogenetic trees: the polyhedrin gene, DNA
polymerase (Ahrens & Rohrmann, 1996; Pellock et al., 1996)
and ecdysteroid UDP-glucosyltransferase (Barrett et al.,
1995). The phylogenetic trees of the last two genes
supported the hypothesis generated from the phylogenetic
tree of polyhedrin genes.

DNA polymerase genes have been classified into four
families, A, B, C, and X (Heringa & Argos, 1994).
The baculovirus DNA polymerase belongs to family B, which is
also the type of polymerase found in various other organisms
including bacteria, viruses, yeasts and mammals (Heringa
& Argos, 1994). By comparing the nucleotide sequence of the
AcMNPV DNA polymerase gene with those from two other insect
DNA viruses, the ascovirus and entomopoxvirus, it was
concluded that they have independent evolutionary paths
(Pellock et al., 1996).

Moreover, baculovirus egt genes were used to study
their phylogenetic relationships. The egt proteins range
from 55 to 60 kDa (O'Reilly & Miller, 1990; Riegel et al.,
1994), and catalyze the transfer of glucose to ecdysteroids
(O'Reilly & Miller, 1989). The molting and pupation of
infected insect larvae have been shown to be blocked because
of an imbalance in this insect hormone (O'Reilly & Miller,
1989). Deletion of the egt gene can speed the killing time
of insect larvae by AcMNPV (O'Reilly & Miller, 1991).
However, histopathological investigation showed that
degeneration of the Malpighian tubules causes the more
rapid death of insect larvae infected by an AcMNPV egt gene
deletion mutant (Flipsen et al., 1995). Deletion of the egt
gene has been suggested as a means of improving baculovirus
pesticides (O'Reilly & Miller, 1991). The egt proteins
also share 21 to 22% amino acid sequence identity with
several mammalian UDP-glucuronosyltransferases (O'Reilly &
Miller, 1989). Overall, the phylogenetic analysis of the
egt genes from six different baculoviruses supports the
evolutionary scheme of the polyhedrin sequence phylogeny
tree (Barrett et al., 1995).

The reconstruction of a baculovirus phylogenetic tree
based on other baculovirus genes, such as gp41, gp64 and p10,
will provide additional information for examining the
evolutionary hypothesis based on the polyhedrin phylogenetic
tree. Also, the non-protein coding sequences of
baculoviruses could provide useful information for
understanding baculovirus phylogeny. For instance, the
divergence and evolution of homologous regions (HR) between
AcMNPV and BmMNPV have been studied, and results have shown
that the HRs of AcMNPV and BmMNPV are highly conserved
(Majima et al., 1993). However, the high variability of the
HR sequences between genomic variants of the same virus
(Garcia-Maruniak et al., 1996), and the fact that there are
four to eight HR regions in the genome of different
baculoviruses, cause a problem in analyzing the data.

Bioinformatic study

Recently, the rapid development of genome projects, including the mapping of the bacterial (Escherichia coli), yeast (Saccharomyces cerevisiae), nematode (Caenorhabditis elegans), fruit fly (Drosophila melanogaster), and human (Homo sapiens) genomes, created a new field called bioinformatics (Schomburg & Lessel, 1995; Schulze-Kremer, 1996). Using computer programs and macromolecular databases, scientists are able to evaluate the potential biological function of a newly detected gene and its phylogenetic relationship to other genes. A complete search for homologous sequences in the databanks not only provides the data to reconstruct a phylogenetic tree relating the unknown protein to its homologs, but also provides the structural backbone for building a possible three-dimensional (3D) model of the unknown protein (Benner, 1995). Two major databases, GenBank at the National Center for Biotechnology Information (NCBI, USA) and the EMBL (European Molecular Biology Laboratory) database at the European Bioinformatics Institute (EBI, England), are accessible around the world (Doolittle, 1996) and provide nucleotide and primary amino acid sequence information. In addition, the Protein Data Bank (PDB), a protein structure database, collects protein structure information from crystallographic results, and is therefore an important resource for constructing 3D structures of unknown proteins. The development of such databases, computer programs, and computer facilities provides scientists with more efficient ways to search for homologous sequences of an unknown gene, to align multiple sequences, and to reconstruct phylogenetic relationships.

Present  study 

In this study, the baculovirus gp41 gene was chosen for phylogenetic analysis because it has been shown to be highly conserved (Brown et al., 1985; Liu & Maruniak, 1995). Two new gp41 gene DNA sequences, from AgMNPV and SfMNPV, were generated and compared with other known gp41 genes. The secondary structure and possible functional domains of the gp41 proteins were predicted using several computer programs.
Genomic  regions  of  the  gp41  gene  from  different 
baculoviruses  were  compared  in  order  to  better  understand 
the  evolutionary  relationships  among  these  viruses.  The 
phylogenetic  tree  of  baculoviruses  was  reconstructed  based 
on  several  phylogenetic  trees  of  baculovirus  genes  so  that 
the  present  baculovirus  evolutionary  hypotheses  could  be 
examined.     Insect  hosts  of  baculoviruses  were  also  studied 
in  order  to  reveal  the  evolutionary  relationship  between 
baculoviruses  and  their  hosts. 

This  study  will  not  only  contribute  to  an  understanding 
of  the  evolutionary  relationships  among  baculoviruses,  but 
also  could  be  used  as  a  reference  to  choose  baculoviruses 
for  developing  recombinant  baculoviruses.     Since  recombinant 
baculovirus  techniques  depend  on  the  homology  of  the 
baculovirus  DNA  genome,   the  phylogenetic  tree  could  be  used 
as  a  phenetic  tree  to  indicate  homologous  relationships 
among  the  viruses.     Eventually,   this  study  will  benefit 
research  involving  both  the  basic  molecular  evolution 
analysis  and  the  practical  application  of  baculoviruses. 


CHAPTER  2 

NUCLEOTIDE  SEQUENCE  AND  TRANSCRIPTIONAL  ANALYSIS  OF  THE  GP41 
GENE  OF  Spodoptera  frugiperda  NUCLEAR  POLYHEDROSIS  VIRUS 


Introduction 


Spodoptera frugiperda MNPV (SfMNPV-2) is a member of the family Baculoviridae. SfMNPV-2 has a double-stranded DNA genome of approximately 121 kb. The SfMNPV physical map for a number of restriction endonucleases has been described, and the restriction endonuclease profiles differ from those of other NPVs (Loh et al., 1981; Maruniak et al., 1984). However, two regions of DNA homology on the physical maps of SfMNPV-2 and S. exempta MNPV (SeMNPV-25), an Autographa californica MNPV genomic variant (Brown et al., 1985), have been identified by hybridization under high-stringency conditions. One of these two regions contained the polyhedrin gene (Brown et al., 1987); the other region is shown in the current report to be associated with the gp41 structural protein gene.

Two types of virions are produced during the nuclear polyhedrosis virus life cycle. Virions found within the viral inclusion bodies (IBs) are termed occluded viruses (OVs). They obtain their envelope de novo in the nuclei of infected cells, and the OV envelope is involved in the recognition of host microvilli during infection. The second type of baculovirus virion is the budded virus (BV), also referred to as the extracellular virus (ECV). Single nucleocapsids bud through the plasma membrane of infected cells to form the ECV (Granados & Williams, 1986; Blissard & Rohrmann, 1990). These virions appear to be specialized for secondary infection of other host cells and contain virus-encoded envelope glycoproteins involved in host cell infection, such as gp64 (Maruniak, 1979; Keddie & Volkman, 1985).

The gp41 structural protein has been identified as a major OV glycoprotein by metabolic labeling (Maruniak, 1979; Stiles & Wood, 1983). It has also been detected by the binding of horseradish peroxidase-linked concanavalin A, indicating that it is glycosylated (Braunagel & Summers, 1994). Furthermore, a single O-linked N-acetylglucosamine covalently bonded to the polypeptide was identified (Whitford & Faulkner, 1992a). Experiments with monoclonal antibodies indicated that gp41 is present only in OV; it appears to be associated with OV but not with purified nucleocapsids or the ECV (Whitford & Faulkner, 1992a; Ma et al., 1993). The gp41 protein has been predicted to lie between the envelope membrane and the capsid (the tegument) of the OV. On the other hand, Braunagel & Summers (1994) indicated that viral proteins of 40-41 kDa are glycosylated in both the OV and the ECV. However, the monoclonal antibody data suggest that the gp41 proteins of ECV and OV are different proteins. The gene encoding the gp41 protein has been characterized (Nagamine et al., 1991; Whitford & Faulkner, 1992b; Ma et al., 1993; Ayres et al., 1994; Kool et al., 1994), but the biological function of the gp41 protein is still unknown.

In this chapter, the complete nucleotide and translated amino acid sequences of the SfMNPV-2 gp41 gene are presented. The sequences were compared with other known gp41 gene sequences of different baculoviruses to reveal possible functional domains of the gp41 protein. A possible transcriptional regulation mechanism and the phylogenetic relationships of the gp41 gene among the different baculoviruses are discussed.


Methods

Virus and Cell Culture

The S. frugiperda MNPV isolate SfMNPV-2 (Maruniak et al., 1984) was propagated in the S. frugiperda Sf-9 cell line (Luckow and Summers, 1988b). Sf-9 cells were maintained at 27°C in TC-100 medium supplemented with 10% fetal bovine serum (Life Technology) and 50 µg/ml gentamicin.

DNA  Cloning  and  Sequencing 

The SfMNPV-2 EcoRI-S DNA fragment was cloned into pGEM3Z and pGEM7Zf(+) vectors (Promega Corp.), and the subfragments EcoRI-HindIII (0.5 kbp), EcoRI-PstI (0.8 kbp), PstI-EcoRI (1.1 kbp) and HhaI-HhaI (0.7 kbp) were cloned into pGEM3Z. Exonuclease-digested subclones were generated with the Erase-a-Base system (Promega Corp.). The protocol was modified by precipitating the exonuclease-digested DNA before the DNA ligation step, because exonuclease inhibition was incomplete when the manufacturer's instructions were followed. The extra DNA precipitation step was introduced between the S1 nuclease digestion and the Klenow enzyme treatment. Sequencing was performed by the dideoxynucleotide chain-termination method (Sanger et al., 1977) with Sequenase (United States Biochemical Corp.). The oligonucleotide primers were synthesized by the DNA Synthesis Laboratory of the Interdisciplinary Center for Biotechnology Research at the University of Florida.

Computer  Analysis 

The Wisconsin Sequence Analysis Package™ (Version 8.1, VMS; Genetics Computer Group) was used for comparing nucleotide and amino acid sequence identities (GAP), generating the multiple sequence alignment (PileUp), and plotting the hydrophobicity profile (PepPlot). The BLAST program (Altschul et al., 1990) was used to search the GenBank databank for homologous nucleotide sequences through the e-mail service at the National Center for Biotechnology Information (NCBI, USA). The Fetch program was used to retrieve nucleotide sequences from the local GenBank database.
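
To illustrate the kind of hydrophobicity profile described above, the following is a minimal sketch of a sliding-window Kyte-Doolittle (1982) calculation. It is not the PepPlot implementation: the function name hydropathy_profile and the window of 9 residues are assumptions made only for illustration, and the test fragment is the first 20 residues of the AcMNPV-E2 row as printed in Fig. 2.5, with the alignment gap removed.

```python
# Kyte-Doolittle (1982) hydropathy index for the 20 standard amino acids.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=9):
    """Return the mean hydropathy of every full window along the sequence.

    The window size is an assumed illustrative value, not the PepPlot default.
    """
    values = [KD_SCALE[residue] for residue in sequence.upper()]
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# First 20 residues of the AcMNPV-E2 row in Fig. 2.5 (gap character removed).
fragment = "MTDERGNFYYNTPPPLRYPS"
profile = hydropathy_profile(fragment, window=9)
for position, score in enumerate(profile, start=1):
    print(f"window starting at residue {position}: {score:+.2f}")
```

Positive window means indicate hydrophobic stretches and negative means indicate hydrophilic stretches; plotting the list against the window position gives a profile of the kind compared among the gp41 proteins.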


RNA  Purification 

Total cellular RNA was isolated by the guanidine isothiocyanate method (Ausubel et al., 1989) from 3 x 10^6 Sf-9 cells infected with SfMNPV-2 at a multiplicity of infection of 10 plaque-forming units (PFU) per cell. At various times postinfection (p.i.), the cells were lysed in 4 M guanidine isothiocyanate pH 5.5, 20 mM sodium acetate, 0.1 mM dithiothreitol (DTT) and 0.5% sarkosyl. Cell lysates were layered over a 5.7 M CsCl solution (0.1 mM EDTA) and centrifuged at 100,000 x g for 24 hours in a swinging-bucket AH650 rotor (DuPont). The RNA was dissolved in sterile water and ethanol precipitated. After the RNA pellet was washed in 70% (v/v) ethanol, it was dissolved in sterile water. The RNA concentration was determined by measuring the UV absorbance at 260 nm (OD260 x 40 = µg/ml).
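
The conversion quoted above (OD260 x 40 = µg/ml) can be written as a short calculation. This is a sketch only: the function name and the optional dilution factor are illustrative assumptions and are not part of the protocol described here.

```python
def rna_concentration_ug_per_ml(od260, dilution_factor=1.0):
    """Convert an absorbance reading at 260 nm to an RNA concentration.

    Uses the relation from the text: one OD260 unit corresponds to 40 ug/ml RNA.
    dilution_factor is an assumed convenience for samples diluted before reading.
    """
    if od260 < 0 or dilution_factor <= 0:
        raise ValueError("od260 must be >= 0 and dilution_factor must be > 0")
    return od260 * 40.0 * dilution_factor


# Hypothetical reading: OD260 of 0.25 measured on a 1:100 dilution.
print(rna_concentration_ug_per_ml(0.25, dilution_factor=100))
```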

Northern  Blot  Hybridization 

A total of 5 µg RNA was denatured with 7% formaldehyde, 50% formamide and 1X MOPS buffer (0.2 M MOPS pH 7.0, 50 mM sodium acetate and 10 mM EDTA) at 55°C for 15 min. Before electrophoresis, 0.1 volume of 10X loading buffer (20% Ficoll 400, 1% SDS, 0.1 mM EDTA, 0.25% Bromophenol Blue and Xylene Cyanol FF) was added. Total RNA was electrophoresed in a 1% agarose gel (1% formaldehyde and 1X MOPS buffer) in 1X MOPS buffer (Maniatis et al., 1989). The separated RNAs were transferred to a Zeta-Probe blotting membrane (Bio-Rad Laboratories, Inc.) with 20X SSC buffer (Maniatis et al., 1989). After transfer, the membrane was air dried and baked at 80°C for 1 h. The DNA probe, containing 50 ng of the SfMNPV-2 EcoRI-S DNA fragment, was prepared by nick translation (United States Biochemical Corp.) using 30 µCi [α-32P]dCTP (3000 mCi/mmole). Hybridization was done overnight at 42°C, and the blot was rinsed at 42°C twice each with 5% and 1% SDS washing buffer (40 mM NaHPO4 pH 7.2, 1 mM EDTA) as described by the manufacturer (Bio-Rad Laboratories, Inc.). The blot was exposed to Kodak X-OMAT film.

Primer  Extension 


A total of 10 µg RNA, isolated from the infected Sf-9 cells, was mixed with 0.5 µg of a 20-mer oligonucleotide primer (5'-GACGTAATCGACACATTTGT-3'). This primer was complementary to the region from 104 to 123 bases downstream of the translation start codon of the SfMNPV-2 gp41 protein gene. The RNA and the primer were incubated at 30°C overnight. The extension reaction was done in buffer containing 50 mM Tris-HCl, pH 8.3, 75 mM KCl, 3 mM MgCl2, 10 mM DTT, 0.12 mM of each deoxyribonucleotide triphosphate, 25 µCi [α-32P]dCTP (3000 mCi/mmol) and 200 units of Moloney murine leukemia virus reverse transcriptase (Life Technology) for 60 min at 37°C (modified from Ausubel et al., 1989). The reaction was stopped by adding EDTA to a final concentration of 20 mM. The extension products were ethanol-precipitated and resolved on a 6% polyacrylamide sequencing gel. A sequencing ladder was generated as a size marker by a dideoxynucleotide chain-termination reaction using the same primer with a DNA template containing the SfMNPV-2 EcoRI-S fragment.

Results

Cloning and Sequencing of the S. frugiperda EcoRI-S Fragment

The S. frugiperda MNPV-2 EcoRI-S fragment containing the gp41 structural protein gene was cloned into pGEM3Z and pGEM7Zf(+) (Fig. 2.1 A). Subclones were constructed by specific restriction endonuclease digestion and by exonuclease III deletion. The T7 and SP6 promoter primers present in the pGEM vectors and several specific oligonucleotide primers were used for sequencing (Fig. 2.1 B). A major open reading frame (ORF) of 999 nucleotides encoded the gp41 gene, and it was oriented from right to left according to the conventional physical map (Fig. 2.1 B) (Maruniak et al., 1984). The complete sequence of the SfMNPV-2 EcoRI-S fragment (Appendix A) was deposited with the GenBank Data Library. One baculovirus late promoter consensus motif, TAAG (Blissard & Rohrmann, 1990), was found from 39 to 43 nucleotides upstream of the ATG translation start codon. The translation stop codon TGA was followed 394 nucleotides downstream by the polyadenylation signal AATAA.

Figure 2.1. Position of the gp41 gene on the SfMNPV genomic map and sequencing strategy. (A) EcoRI restriction map of the SfMNPV-2 genome (Maruniak et al., 1984). (B) Detailed physical map of the EcoRI-S fragment. The 999 bp gp41 open reading frame is indicated by the bold arrow under the map. The small arrows below the map indicate the extent and direction of the sequence obtained using T7 or SP6 primers or specific primers (indicated by an asterisk).

Transcriptional Analysis of the GP41 Gene

Northern blot analysis of total RNA isolated from infected cells from 3 to 48 h p.i. is shown in Fig. 2.2. Two mRNAs of approximately 1.6 and 2.8 kb were detected after 12 h p.i. and remained detectable at 48 h p.i. when the SfMNPV EcoRI-S fragment containing the gp41 coding region was used as a probe.

Primer extension analysis was used to identify the transcription start sites. A 20-mer oligonucleotide complementary to nucleotides 104 to 123 of the coding sequence was used. Three transcription start sites were located (Fig. 2.3). Two of them were located at nucleotides -42 and -41 relative to the ATG translation start codon, within the first T and second A of the TAAG consensus motif (Fig. 2.3). The third transcriptional start site was located at nucleotide -140 from the ATG start codon, in a region for which no consensus motif has been determined (Fig. 2.3).
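
As a worked illustration of how the start sites relate to the primer, the sketch below computes the mRNA-sense target of the primer and the cDNA length expected for each reported start site. The coordinate convention (A of the ATG as +1, no position 0) and the function names are assumptions made for this example; the lengths are derived values, not measurements read from the gel.

```python
PRIMER = "GACGTAATCGACACATTTGT"  # 20-mer used for primer extension (5' to 3')


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA sequence."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def extension_product_length(primer_5prime_end, start_site):
    """Length (nt) of the cDNA from the primer 5' end to the transcript start.

    Assumed convention: the A of ATG is +1 and there is no position 0, so a
    start site upstream of the ATG is a negative number such as -42.
    """
    if primer_5prime_end < 1 or start_site > -1:
        raise ValueError("expected a positive primer end and a negative start site")
    return primer_5prime_end - start_site


# The primer is complementary to coding nucleotides 104 to 123, so its 5' end
# pairs with nucleotide 123 of the mRNA.
print("mRNA-sense target:", reverse_complement(PRIMER))
for site in (-42, -41, -140):
    print(f"start site {site}: expected cDNA of "
          f"{extension_product_length(123, site)} nt")
```

Under these assumptions the two start sites within the TAAG motif would give products one nucleotide apart, and the upstream start site a product roughly 100 nucleotides longer.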

Figure 2.2. Northern blot analysis of gp41 gene transcripts. Total RNA was extracted from uninfected Sf-9 cells (lane 2; C, the uninfected cell control) and from SfMNPV-infected Sf-9 cells at 3, 6, 12, 24 and 48 h p.i. (lanes 3 to 7, respectively). The gp41 gene transcripts were detected with a 32P-labeled SfMNPV EcoRI-S DNA fragment. The 1 kb ladder standard (in kilobases) is shown on the left side of the blot (lane 1; M, the size marker).

Figure 2.3. Primer extension analysis of gp41 gene transcripts. Total RNA extracted from SfMNPV-infected Sf-9 cells at 48 h p.i. was mixed with the primer 5'-GACGTAATCGACACATTTGT-3'. The cDNAs were synthesized using Moloney murine leukemia virus reverse transcriptase and were separated on a 6% sequencing gel. Three transcription start sites were identified (lane 5; P, the primer extension product). The T and A transcription start sites were within the TAAG motif. The upper T transcription start site was not associated with any known motif. The complementary sequence ladder is shown on the left side in the lane order G, A, T and C.

Amino Acid and Nucleotide Sequence Comparison of SfMNPV-2 with Other Baculoviruses

The amino acid and nucleotide sequences of the S. frugiperda gp41 gene were compared with three other NPV gp41 genes, from A. californica MNPV (AcMNPV-E2), Bombyx mori MNPV (BmMNPV) and Helicoverpa zea SNPV (HzSNPV) (Table 2.1).


Table 2.1. Amino acid sequence similarities and nucleotide sequence identities (%) of gp41 structural protein*.

             BmMNPV     HzSNPV     SfMNPV-2
AcMNPV-E2    96 (96)    75 (60)    72 (59)
BmMNPV                  75 (59)    74 (59)
HzSNPV                             76 (62)

* The first value in each cell (bold in the original) is the amino acid sequence similarity; the value in parentheses is the nucleotide sequence identity.
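
For readers unfamiliar with how a percent identity such as those in Table 2.1 is obtained from a pairwise alignment, the following is a minimal sketch. It is not the GAP program: the choice to exclude gapped columns from the denominator is an assumption, and the two short sequences are hypothetical, not gp41 data.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Assumption for this sketch: columns where either sequence has a gap are
    excluded from both the match count and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if a != gap and b != gap
    ]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Hypothetical aligned nucleotide fragments (not taken from the gp41 genes).
seq_1 = "ATGGCA-TTA"
seq_2 = "ATGACAGTTG"
print(f"{percent_identity(seq_1, seq_2):.1f}% identity")
```

Amino acid similarity, as opposed to identity, additionally counts conservative substitutions and therefore depends on a scoring matrix, which this sketch does not attempt to reproduce.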


At the nucleotide level, the NPV sequences shared an average of 60% identity, except for AcMNPV-E2 and BmMNPV, which shared a much higher identity (97%). At the amino acid level, however, the predicted polypeptide sequences were more conserved (70% similarity). Kyte-Doolittle (1982) and Goldman (reviewed by Engelman et al., 1986) analyses were performed to compare the distributions of hydrophilic and hydrophobic domains among the four NPV proteins. AcMNPV-E2 and BmMNPV had almost identical hydrophobicity patterns, while SfMNPV-2 and HzSNPV showed a similar hydrophobicity pattern overall (Fig. 2.4). In general, the hydrophobicity profiles of all four NPVs were similar within amino acids 100 to 340 of AcMNPV-E2 and BmMNPV and amino acids 40 to 280 of SfMNPV-2 and HzSNPV (Fig. 2.4). The predicted amino acid sequences of all four NPVs were compared to show the conserved regions (Fig. 2.5). Sixteen conserved regions (defined as more than three contiguous identical amino acids) were found within the whole sequence alignment. Within the region from amino acid 50 to 350 of the comparison, 9 of 14 prolines were conserved among the NPVs.
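
The definition of a conserved region used above (more than three contiguous identical amino acids) can be applied to a multiple alignment as sketched below. The function name and the toy alignment are hypothetical and serve only to show the counting rule; they are not the gp41 alignment of Fig. 2.5.

```python
def conserved_regions(aligned, min_len=4, gap="-"):
    """Find runs of columns that are identical in every aligned sequence.

    Returns (start, end) pairs as 0-based, half-open column ranges. A run must
    span at least min_len columns; min_len=4 corresponds to "more than three
    contiguous amino acids being the same".
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    regions = []
    run_start = None
    for col in range(length + 1):  # one extra step closes a run at the end
        identical = (
            col < length
            and aligned[0][col] != gap
            and all(seq[col] == aligned[0][col] for seq in aligned)
        )
        if identical:
            if run_start is None:
                run_start = col
        else:
            if run_start is not None and col - run_start >= min_len:
                regions.append((run_start, col))
            run_start = None
    return regions


# Hypothetical three-sequence alignment used only to exercise the rule.
toy_alignment = ["ACDEFGHIK", "ACDEFGYIK", "TCDEFGHIK"]
for start, end in conserved_regions(toy_alignment):
    print(f"columns {start + 1}-{end}: {toy_alignment[0][start:end]}")
```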

In addition to the comparison of amino acid sequences, the nucleotide sequences of the region upstream of the ATG translation start codon of the four NPVs were compared. The sequence alignment around the late gene transcriptional consensus motif from -52 to -46 nucleotides of all four NPVs

Figure 2.4. Comparison of hydrophilic-hydrophobic profiles among the homologous gp41 proteins. The solid line shows the Kyte-Doolittle (1982) analysis and the dashed line shows the Goldman analysis (reviewed by Engelman et al., 1986).

1  60
AcMNPV-E2     MTDERGNFYY  NT-PPPLRYP  SNPATAIFTS  AQTY-NAPGY  VPPATVPTTV  ATRDNRMDYT 

AcMNPV-HR3   -  -  

BmMNPV   P  N   .  .  .  .  N     K.  .  -  . 

HzSNPV             MS 

SfMNPV-2             MAN.. 

61  120 
AcMNPV-E2     SRSNSTNSVA  IAPYNKS-KE  PTLDAGESIW  YNKCVDFVQK  IIRYYRCNDM  SELSPLMILF

AcMNPV-HR3   -

BmMNPV   -  H  . 

HzSNPV  LPHAV.TALQ  HQQHQ.QLQ.   SSS . .  .    T  Y.ER   ...F..T...  .H.T.Q..ML 

SfMNPV-2       RPNSI.K.--   STMSSS . LSS  SSSA.ITEP.   MD....Y.N.    .V....T...    . Q . T . Q . LNL 

121  180 
AcMNPV-E2     INTIRDMCID  TNPISVNWK  RFESEETMIR  HLIRLQKELG  QSNAAESLSS  DSNIFQPSFV 

AcMNPV-HR3   

BmMNPV   N  G  P  A... 

HzSNPV   L.VE  SH  D  .  D  .  NL  .  K  HYS  .  .  R  .  .  .  .    G.EV.   -E  

SfMNPV-2   NV..E    .Y.VD.  .AT.    .  .  D  .  DVNLMN  NYK   NKPIT  -.D..KA... 

181  240 
AcMNPV-E2     LNSLPAYAQK  FYNGGADMLG  KDALAEAAKQ  LSLAVQYMVA  EAVTCNIPIP  LPFNQQIANN

AcMNPV-HR3

BmMNPV   S  

HzSNPV  Y.V..S  K.  .ENVS  G.SVS.  .  .HE    .GE.L..QI.    ...AS.T  VRH..V.T 

SfMNPV-2       YSV..S  K.G.H.A  SGSVE  .  .  .RH   .GY.L..QI.   Q...T.T  D  D 

241  300 

AcMNPV-E2     YMTLLLKHAT  LPPNIQSAVE  S  RRFPH  INMINDLINA  VIDDLFAGG-  GDYYHYVLNE

AcMNPV-HR3     -

BmMNPV     -   

HzSNPV  . I....QR.N  I...V.D..S    .  .KY.T  L.I  N    ....V.T.VY  .N..Y  

SfMNPV-2        .L....QR.N  I.T...EIIN    . GNRTHGNSR  VH...A...N   -  S...L  

301  360 
AcMNPV-E2     KNRARVMSLK  ENVAFLAPLS  ASANIFNYMA  ELATRAGKQP  SMFQNATFLT  SAANAVNSPA

AcMNPV-HR3

BmMNPV   I  p  

HzSNPV   IVT.  .    ..IG  TD..Q.I.   N  R.    .  L  .  .  G .  .  .  .  N  APSS- - .  GSN 

SfMNPV-2       T.KS.IL  ISYM  TT....FI.   T...NS..K.    .V..S.SM..  MPLT--KPV- 

361  418 
AcMNPV-E2     AHLTKSACQE  SLTELAFQNE  TLRRFIFQQI  NYNKDANAII  AAAAPNATRP  NTKGRTA*

AcMNPV-HR3  .R.IRRP  LI*

BmMNPV   .R.IRLP  LI*          

HzSNPV  VEQNRTS . . Q    A...Y...KL  S.KQNY*      

SfMNPV-2       VSES.NV..Q  Q  E.  .  A  L  S.KN.ISQL*     


Figure 2.5. Comparison of the amino acid sequences of four NPV gp41 proteins. The one-letter amino acid code is used. Hyphens denote gaps introduced by the computer program. Dots denote identical amino acids. The abbreviations for the viruses are given in the text.

was identical (Fig. 2.6). Another late gene transcriptional motif from -20 to -17 was identified in AcMNPV-E2 and BmMNPV; however, this consensus region was changed by one or two nucleotides in SfMNPV and HzSNPV.


-99 


-40 


AcMNPV- 
BmMNPV 
HzMNPV 
SfMNPV 


E2     TAATTTTGTT  AATTTTATTA  TCGCTTTTTT  GTCACAACAA  CTATATTATA  AGTAATCCGT 

.C...A.A..    C  GA.  .    .TAT.G.A.G  TGA   T  G.^  G....  A 

.T  CG.    ...GA....C   .  T  .  .  .  A .  A .  C  TAA .  .  .  .  T  .  .   T  G.^   AA 


-39  1 
AcMNPV- E2     ATATTGAGTT  TTGTAATCAT  AAGAGTACAA  ATAAAAAGTA  TG 

BmMNPV  G   .  .  A  TG 

HzMNPV  CG. .AA.T.A  C. . .CCA. .C   . . ATTG . T . .    . . T . T .  A  TG 

SfMNPV  .A....TT.A  C..CCC  A.AACAC.   A  TG 


Figure 2.6. Computer alignment of the DNA sequences flanking the gp41 structural protein genes of AcMNPV-E2, BmMNPV, HzSNPV and SfMNPV-2. The TAAG consensus sequences are underlined or double underlined. The ATG translation start codons are shown in bold italic letters.


Discussion 


A unique feature of the NPV life cycle is the production of two virion phenotypes: the occluded virion (OV) and the extracellular virus (ECV). The biophysical, biochemical and morphological characteristics of the OV and ECV are quite different. These structural differences may play a functional role in their biological properties. During viral infection, one of the virus-encoded envelope glycoproteins, gp64, is expressed and is involved in host cell infection. The gp64 protein is a component of the virion peplomers, which are detected only in the ECV and are essential for entry of ECV into cells by adsorptive endocytosis (Keddie & Volkman, 1985). In contrast to gp64, gp41 is associated only with OV. The gp41 structural protein was found exclusively in enveloped OV but not in either ECV or envelope-stripped OVs (Whitford & Faulkner, 1992a). Currently, the biological function of gp41 is not known, but its biochemical characteristics suggest that gp41 may be involved in facilitating the occlusion of virions in the polyhedra or the infection of host midgut cells.

In this study, we presented the nucleotide sequence and transcriptional analysis of the SfMNPV-2 gp41 gene. The nucleotide sequence of the SfMNPV-2 gp41 gene shows different degrees of homology with those of the three other NPVs, AcMNPV-E2, BmMNPV and HzSNPV (Table 2.1). The nucleotide sequence identities between SfMNPV-2 and the other NPVs were low (about 60%). Similar results were reported when DNA homology was compared among the NPVs of four different Spodoptera species, S. exempta, S. exigua, S. frugiperda and S. littoralis: SfMNPV was considered distantly related to the other NPVs (20-30% by reassociation kinetics) (Kelly, 1977). The molecular approach based on the polyhedrin gene phylogenetic tree also suggested that SfMNPV groups distantly from AcMNPV and BmMNPV (Zanotto et al., 1993). These results indicate that SfMNPV diverged earlier from these other NPVs, whereas the gp41 genes of AcMNPV and BmMNPV are almost identical (97%). Together with the results of the polyhedrin gene analysis, this suggests that AcMNPV and BmMNPV are very closely related species (Rohrmann, 1986; van Strien et al., 1992).

When the hydrophilic and hydrophobic profiles of the gp41 polypeptide of SfMNPV-2 were compared with those of the other NPVs, SfMNPV-2 showed an overall pattern similar to that of HzSNPV. Amino acids 100 to 340 of AcMNPV-E2 and BmMNPV showed a hydrophobicity pattern similar to that of amino acids 40 to 280 of HzSNPV and SfMNPV-2. The high hydrophilicity of the carboxyl terminus of the p10 protein has been reported, and this region has been proposed to form a functional domain exposed at the surface of the protein. The hydrophobic region in the middle of the p10 protein may play a bundling or cross-linking function (van Oers et al., 1993). The amino acid sequences of the gp41 polypeptides of these NPVs were compared to reveal the conserved sequence regions (Fig. 2.5). These conserved amino acid sequences may represent functional domains, since no amino acid change was found in those regions. Specifically, the regions containing proline and cysteine may be involved in maintaining the gp41 polypeptide conformation. In addition to these conserved regions, the first 50 amino acids of the alignment between AcMNPV and BmMNPV were identical. Also, amino acids 368 to 393 of SfMNPV-2 and HzSNPV were almost identical (Fig. 2.5). These data suggest that SfMNPV-2 and HzSNPV may have evolved from a common ancestor, and that AcMNPV and BmMNPV diverged from another distantly related ancestor.

By northern blot analysis, two gp41 gene transcripts were found after 12 h p.i. These data confirm previous findings that gp41 is a late gene (Whitford & Faulkner, 1992b; Ma et al., 1993). One of the transcripts was 1.6 kb and the other was 2.8 kb long. According to the DNA sequence, the distance from the gp41 gene transcriptional start site to the poly(A) signal is 1,433 nucleotides. With the addition of a poly(A) tail (usually about 200 bases), the estimated size of the gp41 gene transcript was about 1.6 kb. On the other hand, the 2.8 kb transcript was not consistent with termination at this signal. One explanation for the 2.8 kb transcript is that the poly(A) signal located 394 nucleotides downstream from the translation stop codon was bypassed. This phenomenon of ignoring the major transcriptional stop signal has been reported both for the gp41 gene (Whitford & Faulkner, 1992b) and for the p39 capsid gene of AcMNPV (Thiem & Miller, 1989). Another explanation for the two transcript sizes is that the 1.6 kb transcript was a spliced product of the 2.8 kb RNA. However, this explanation is not favored because the gp41 coding sequence does not appear to be separated into two regions, and splicing is not a common phenomenon in baculoviruses except for IE1 or IE0 (Kovacs et al., 1991). It must also be acknowledged that the 2.8 kb transcript could be a transcription product of a gene other than gp41, since the whole SfMNPV EcoRI-S fragment was used as a probe. In total, four potential open reading frames were identified within the SfMNPV EcoRI-S fragment.
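
The size estimate for the 1.6 kb transcript discussed above is simple arithmetic, written out here as a sketch. The function name is an illustrative assumption; the 1,433-nucleotide distance and the poly(A) tail length of about 200 bases are the values given in the text.

```python
def estimated_transcript_kb(start_to_polya_signal_nt, polya_tail_nt=200):
    """Estimate transcript size in kb from sequence distance plus poly(A) tail."""
    if start_to_polya_signal_nt < 0 or polya_tail_nt < 0:
        raise ValueError("lengths must be non-negative")
    return (start_to_polya_signal_nt + polya_tail_nt) / 1000.0


size_kb = estimated_transcript_kb(1433)
print(f"estimated gp41 transcript: {size_kb:.1f} kb")
```

The result rounds to 1.6 kb, which agrees with the smaller band on the northern blot; the 2.8 kb band would require roughly 1.2 kb of additional sequence, which is why readthrough of the poly(A) signal or transcription of another ORF in the probe fragment is considered.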

By primer extension analysis, the transcription start site for the gp41 gene mRNA of SfMNPV-2 was mapped in the promoter region within the TAAG motif, at approximately nucleotide -42 or -41 (T or A). This motif is conserved in all baculovirus late genes, especially those encoding structural proteins (Rohrmann, 1986; 1992; Rankin et al., 1988). However, another transcriptional start site was located at nucleotide -140, where no consensus motif has been determined. A transcription start site that does not resemble a late gene consensus motif is also found in the AcMNPV p74 gene (Kuzio et al., 1989). Alternatively, the -140 signal could result from non-specific primer hybridization, since baculoviruses have a large DNA genome.

An unexpected small ORF was located downstream of the -140
transcriptional start site, so this start site may be used for
bicistronic transcription. Similar bicistronic transcripts have
been reported by Kovacs et al. (1991). A translational regulation
mechanism is proposed in that paper, since the translation of the
downstream ORF is more efficient than that of the upstream ORF;
the upstream ORF may serve to increase translation initiation
activity. In addition, Ooi and Miller (1991) suggest an antisense
RNA mechanism for transcriptional regulation, which may be used
to turn off initiation of a 3.2 kb RNA. In the transcription of
the gp41 gene, the upstream ORF may act as a competitive
inhibitor that controls gp41 transcription. However, a
bicistronic model cannot be excluded even though the upstream
transcriptional start site is not a common start site for
baculovirus late genes. A site-specific mutation at the upstream
transcription start site would help elucidate whether this start
site is involved in the regulation of gp41.

Kool et al. (1994) sequenced the AcMNPV-E2 EcoRI-C fragment and
found an extra G residue close to the end of the gp41 coding
region when comparing it with the data published by Whitford &
Faulkner (1992b). This result was confirmed by the recent data of
Ayres et al. (1994). The differences among the AcMNPV gp41
sequences may be due to the use of different strains. The results
from Kool et al. (1994) and Ayres et al. (1994) not only extend
the gp41 protein by 65 amino acids but also increase the homology
with HzSNPV and SfMNPV in the C-terminal region (Fig. 2.5). These
data provide new information on the possible evolutionary path of
the gp41 gene, and by comparing them the evolutionary
relationships of baculoviruses may be inferred.


CHAPTER  3 

NUCLEOTIDE  SEQUENCE,   AMINO  ACID  SEQUENCE  AND  GENOMIC 
STRUCTURE  ANALYSIS  OF  THE  GP41  GENE  REGION  AMONG  FIVE 
NUCLEAR  POLYHEDROSIS  VIRUSES 

Introduction 

Anticarsia gemmatalis MNPV (AgMNPV) belongs to the genus
Nucleopolyhedrovirus (family: Baculoviridae) and has a 133-kbp,
closed-circular, double-stranded DNA genome (Murphy et al.,
1995). The virus has been applied as a commercial insecticide on
a large scale to control the soybean pest, A. gemmatalis
(velvetbean caterpillar), in Brazil (Moscardi, 1989). In addition
to its successful field application, AgMNPV has undergone a
series of comprehensive laboratory studies, including the
construction of the genomic map (Johnson and Maruniak, 1989), the
nucleotide sequence of the polyhedrin gene (Zanotto et al.,
1992), and the identification and sequence of a variable region,
homologous region 4 (hr-4) (Garcia-Maruniak et al., 1996).

The gp41 structural protein is a major occluded virion (OV)
glycoprotein of baculoviruses (Maruniak, 1979). Monoclonal
antibody data indicate that gp41 is associated with the OV, but
not with the purified nucleocapsid or with the budded virion (BV)
(Whitford & Faulkner, 1992a; Ma et al., 1993). The gp41 protein
is predicted to be located in the tegument, between the envelope
and the capsid (Whitford & Faulkner, 1992a). However, the
biological function of the gp41 protein is still unknown because
attempts to select recombinant mutants have been unsuccessful,
which suggests that gp41 may be an essential gene.

Recently, developments in bioinformatic analysis have opened a
new approach to studying gene function, in which the primary
nucleotide and/or amino acid sequence is used to predict the
biological function of a protein. Several computer programs are
publicly available, including a protein secondary structure
analysis program that shows more than 70% accuracy (Rost and
Sander, 1993), a transmembrane domain prediction program (Jones
et al., 1994), an O-glycosylation site prediction program (Hansen
et al., 1995), and a three-dimensional protein structure
comparison program (Madej et al., 1995). These programs provide
theoretical data before laboratory data are obtained, and are
also useful for designing laboratory experiments.

In this study, the gp41 nucleotide sequence of AgMNPV-2D was
compared with the gp41 regions of Autographa californica MNPV
(AcMNPV) (Kool et al., 1994), Bombyx mori MNPV (BmMNPV) (Nagamine
et al., 1991), Helicoverpa zea SNPV (HzSNPV) (Ma et al., 1993)
and Spodoptera frugiperda MNPV-2 (SfMNPV-2) (Liu & Maruniak,
1995) to understand the relationship of AgMNPV-2D with other
NPVs. A protein secondary structure analysis was done with
several computer programs to predict the potential motifs
responsible for the biological function of the gp41 protein.
Lastly, the genomic structure of the gp41 gene regions of the
five NPVs was compared to provide some indication of their
phylogenetic relationships.


Methods 

Virus  and  Cell  Culture 

The AgMNPV-2D isolate (Maruniak, 1989) was used as the virus
source and propagated in the Sf-9 (S. frugiperda, fall armyworm)
cell line (Luckow & Summers, 1988). The Sf-9 cell line was
maintained at 27°C in TC-100 medium with 10% fetal bovine serum
(Life Technologies).

DNA  Cloning  and  Sequencing 

Southern blot hybridization was employed to locate the gp41 gene
of AgMNPV-2D. A DNA fragment of SfMNPV-2 within the gp41 gene
(described in Liu & Maruniak, 1995) was labeled with [32P]dCTP
using a nick translation kit (United States Biochemical Corp.)
and used as a probe. The AgMNPV-2D gp41 gene was first mapped to
the 9 kbp HindIII-C fragment (Fig. 3.1). Subsequently, the gp41
gene was localized within a 3.5 kb PstI-HindIII fragment (at
49.8 - 52.4 map units, m.u.), which was cloned into the
pGEM7Zf(+) plasmid (Promega Corp.). A series of exonuclease
deletion subclones was constructed for sequencing using the
Erase-a-Base system (Promega Corp.). The experimental protocol
was modified to precipitate the exonuclease-digested DNA before
the DNA ligation step, because incomplete inhibition of the
exonuclease was found when the manufacturer's instructions were
followed. The extra DNA precipitation step was introduced between
the S1 nuclease digestion and the Klenow enzyme treatment. The
dideoxynucleotide chain-terminator method was used for DNA
sequencing, and the sequence gaps between deletion subclones were
closed using synthesized oligonucleotide primers. Two different
sequencing kits were used: the Sequenase™ Version 2.0 DNA
Sequencing Kit with Sequenase polymerase (United States
Biochemical Corp.) and the fmol™ DNA Sequencing System with Taq
DNA polymerase (Promega Corp.).

Figure 3.1. Position of the gp41 gene on the AgMNPV-2D genomic
map. The 1.005 kb gp41 open reading frame is indicated by the
arrow under the map. Note that the gp41 gene and the polyhedrin
gene have the same transcription direction, from right to left in
the conventional map.

Computer  Analysis 

The Wisconsin Sequence Analysis Package™ (Version 8.1, VMS;
Genetics Computer Group) was used to compare nucleotide and amino
acid sequence identities (GAP), generate the multiple sequence
alignment (Pileup), and plot the hydrophobicity profile
(Pepplot). The BLAST program (Altschul et al., 1990) was used to
search the GenBank and SwissProt data banks for homologous
nucleotide and amino acid sequences through the e-mail service
(Appendix B) at the National Center for Biotechnology Information
(NCBI, USA). The protein secondary structure prediction program
(Rost and Sander, 1993) was accessed through the Internet server
(Appendix B) at the European Molecular Biology Laboratory (EMBL;
Heidelberg, Germany). The transmembrane domain analysis program
(MEMSAT) is freeware (Jones et al., 1994), and the
O-glycosylation site prediction program (Appendix B) was accessed
through the Internet server (Hansen et al., 1995).

For phylogenetic analysis, the MEGA program (Kumar et al., 1993)
was used to construct the phylogenetic trees of the gp41 gene.
Both the nucleotide sequences and the amino acid sequences were
used. The p-distance and neighbor-joining methods were chosen to
generate the tree based on amino acid sequences. For the tree
based on nucleotide sequences, the p-distance and maximum
parsimony methods were used.
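
The p-distance used here is the proportion of aligned sites at
which two sequences differ. The sketch below illustrates that
definition only; it is not MEGA's implementation. The toy
sequences, the function names, and the choice to skip gapped
sites pair by pair are all assumptions made for illustration.

```python
def p_distance(seq_a, seq_b, gap_chars="-."):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap are skipped (an assumed
    pairwise-deletion treatment).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return differences / compared


def distance_matrix(aligned):
    """Pairwise p-distances for a dict of name -> aligned sequence."""
    names = list(aligned)
    return {
        (x, y): p_distance(aligned[x], aligned[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


# Hypothetical toy alignment, not real gp41 data.
toy = {
    "virusA": "ATGGCTAAGT",
    "virusB": "ATGGCTAAGA",
    "virusC": "ATGACT-AGA",
}
for pair, d in distance_matrix(toy).items():
    print(pair, round(d, 3))
```

A matrix of this kind is the input that a distance method such as
neighbor-joining clusters into a tree.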

Results 

DNA  Sequencing  of  the  GP41  Region 

The complete nucleotide sequence of the PstI-HindIII fragment
comprised 3,517 nucleotides (Appendix C) and has been deposited
in GenBank under accession number U37728. An interesting
phenomenon was observed during DNA sequencing. When the fmol™
DNA Sequencing System was used, one inconsistent nucleotide pair
was found in all three repetitions at nucleotide 1,116 (C versus
T) between the gp41 coding strand and the non-coding strand. The
Sequenase sequencing system showed that this nucleotide pair
should be C/G. No specific DNA secondary structure was found
around nucleotide 1,116.

An open reading frame (ORF) of 1,005 nucleotides containing the
gp41 gene was identified from nucleotide 669 to 1,673. The
transcriptional direction of this gene was from right to left
(relative to the AcMNPV polyhedrin gene) in the conventional
genome map (Fig. 3.1). Two NPV late gene motifs (TAAG) were found
at -17 to -20 and -48 to -51 nucleotides from the translation
initiation site (ATG), respectively. A transcriptional stop
signal, AATAAA, was found downstream at nucleotide 745 from the
translation stop site (TGA). In addition to the transcriptional
motifs, the translation start site fits the Kozak rule of
AXXATG(A/G) (Kozak, 1986). When the nucleotide sequence and the
translated amino acid sequence were compared with four other
published NPV gp41 gene sequences using the GAP program from the
GCG package, more than 59% nucleotide sequence identity and more
than 69% amino acid sequence similarity were found (Table 3.1).
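
The two motif checks described above (TAAG late gene motifs
upstream of the ATG, and the AXXATG(A/G) context around it) can
be sketched as simple pattern searches. The sequence below is a
hypothetical toy, built only so that one TAAG falls at -20 to
-17; the function names and the coordinate convention (-1 is the
base immediately upstream of the A of ATG) are assumptions.

```python
import re

LATE_GENE_MOTIF = "TAAG"
KOZAK_CONTEXT = re.compile(r"A..ATG[AG]")  # AXXATG(A/G)


def late_motif_positions(upstream):
    """Return (first, last) positions of each TAAG relative to the ATG.

    `upstream` is the sequence immediately 5' of the ATG; its last
    base is position -1.
    """
    n = len(upstream)
    hits = []
    # Lookahead so that overlapping occurrences would also be found.
    for match in re.finditer(f"(?={LATE_GENE_MOTIF})", upstream.upper()):
        first = match.start() - n
        hits.append((first, first + len(LATE_GENE_MOTIF) - 1))
    return hits


def fits_kozak(seq, atg_index):
    """True if the 7-nt window around the ATG matches AXXATG(A/G)."""
    if atg_index < 3:
        return False
    context = seq[atg_index - 3:atg_index + 4].upper()
    return bool(KOZAK_CONTEXT.fullmatch(context))


# Hypothetical 25-nt upstream region followed by a start codon.
upstream = "CCCCC" + "TAAG" + "C" * 13 + "ACC"
toy = upstream + "ATGGCT"

print(late_motif_positions(upstream))
print(fits_kozak(toy, len(upstream)))
```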

In addition to the gp41 ORF, several other AgMNPV-2D ORFs were
found inside the 3.5 kbp sequenced region. The AgMNPV-2D ORF 1062
showed high homology with the AcMNPV vlf-1 gene. The AgMNPV-2D
vlf-1 gene was then compared with the vlf-1 of AcMNPV and BmMNPV,
the ORF >300 of SfMNPV-2 and the ORF >195 of HzSNPV. The results
showed nucleotide homologies of 76, 77, 63, and 65%,
respectively, and amino acid similarities of 91, 90, 78 and 66%,
respectively (Table 3.1).

Other than the vlf-1 gene, two potential ORFs (ORF 300 and ORF
330) were found at nucleotides 1,804 - 2,103 and 2,100 - 2,429,
respectively. The ORF 330 of AgMNPV-2D was compared with the ORF
330 of AcMNPV, ORF 330 of BmMNPV, ORF 348 of SfMNPV-2, and ORF
330 of HzSNPV and showed nucleotide homologies of 68, 65, 58, and
57%, respectively (amino acid sequence similarities of 78, 80,
60, and 64%, respectively; Table 3.1). The data suggested a
minimum of 50-60% homology and similarity among the NPVs
analyzed. Meanwhile, the AgMNPV-2D ORF 300 showed homology with
ORF 312 of AcMNPV and BmMNPV (70% homology and 75% similarity;
Table 3.1). However, no sequences homologous to the AgMNPV-2D ORF
300 were found in the SfMNPV-2 and HzSNPV gp41 regions.

Table 3.1. Percentage of the nucleotide sequence identities and
amino acid sequence similarities of the ORFs within the gp41 gene
region(a).

ORF      Virus    Accession(b)  BmMNPV     SfMNPV     HzSNPV     AgMNPV
vlf-1    AcMNPV                 97 (33)    65 (?)     65 (?)     76 (91)
         BmMNPV   L33180                   67 (80)    61 (71)    77 (90)
         SfMNPV   U14725                              63 (73)c   63 (78)c
         HzSNPV   L04747                                         65 (66)
ORF 327  AcMNPV                 95 (96)    51 (56)    55 (64)    68 (78)
         BmMNPV                            53 (59)    54 (64)    65 (80)
         SfMNPV                                       54 (61)    58 (60)
         HzSNPV                                                  57 (64)
ORF 312  AcMNPV                 99 (100)   -          -          70 (75)
         BmMNPV                            -          -          70 (75)
gp41     AcMNPV                 98 (96)    59 (70)    60 (75)    70 (82)
         BmMNPV                            59 (72)    60 (74)    74 (80)
         SfMNPV                                       75 (62)    58 (69)
         HzSNPV                                                  59 (71)
ORF 699  AcMNPV                 94 (97)    58 (70)c   54 (70)c   69 (80)c
         BmMNPV                            59 (70)c   55 (70)c   68 (81)c
         SfMNPV                                       58 (72)c   58 (68)c
         HzSNPV                                                  53 (71)c

(a) Values outside parentheses are nucleotide sequence identities
(%); values in parentheses are amino acid sequence similarities
(%).
(b) GenBank accession number. The sequence of the AgMNPV gp41
gene region has been deposited under U37728.
(c) Incomplete ORFs were used for amino acid sequence comparison.

In addition to the intact ORFs, one partial ORF (ORF >667) was
found at nucleotides 1-667; it had moderate nucleotide sequence
homology (69%) but high amino acid similarity (80%) with AcMNPV
ORF 699. When the partial ORF >667 of AgMNPV-2D was compared with
the ORF 699 of AcMNPV, ORF 702 of BmMNPV, ORF >258 of SfMNPV-2
and ORF >299 of HzSNPV, the results indicated nucleotide
homologies of 69, 68, 58, and 53%, respectively, and amino acid
similarities of 80, 81, 68, and 71%, respectively (Table 3.1).
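
The ORF coordinates reported in this section can be checked
against the ORF names with a short sketch, assuming that each ORF
number is its length in nucleotides and that coordinates are
inclusive. Each coordinate range is paired here with the ORF
whose name matches its length; the dictionary keys are
illustrative labels.

```python
def orf_length(start, end):
    """Length in nucleotides of an ORF given inclusive coordinates."""
    return end - start + 1


# Inclusive coordinates within the 3,517-nt PstI-HindIII fragment.
orfs = {
    "gp41": (669, 1673),
    "ORF 300": (1804, 2103),
    "ORF 330": (2100, 2429),
    "ORF >667 (partial)": (1, 667),
}

for name, (start, end) in orfs.items():
    length = orf_length(start, end)
    whole_codons = length % 3 == 0
    print(f"{name}: {length} nt, whole number of codons: {whole_codons}")

# ORF 300 and ORF 330 share a few nucleotides at their junction.
overlap = orfs["ORF 300"][1] - orfs["ORF 330"][0] + 1
print(f"ORF 300 / ORF 330 overlap: {overlap} nt")
```

The 1,005-nt gp41 ORF corresponds to 335 codons including the
stop codon. The partial ORF is not a multiple of three, as
expected for an ORF cut by the end of the cloned fragment.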

Phylogenetic  Analysis 

Based on the nucleotide sequences and translated amino acid
sequences, a phylogenetic tree of the gp41 gene (Fig. 3.2) was
generated with the Pileup program (GCG package) and the MEGA
package (Kumar et al., 1993). The results showed that AcMNPV and
BmMNPV were closely related. In addition, HzSNPV and SfMNPV-2
grouped into one branch, and AcMNPV, BmMNPV and AgMNPV-2D grouped
into another.
Figure 3.2. Phenogram of the divergence among five NPVs based on
(A) the nucleotide sequences and (B) the amino acid sequences of
the gp41 genes. The number on top of each line represents the
distance between each NPV or to the branch point.

Protein Hydrophobicity Profile Analysis

Figure 3.3 shows the hydrophobicity profile and the conserved
hydrophobic domains of the gp41 protein among the five NPVs (Kyte
& Doolittle, 1982). Five conserved hydrophobic domains were
assigned arbitrarily, based on the similarity of the hydrophobic
pattern among the five NPVs.
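
A hydrophobicity profile of the Kyte & Doolittle type is a
sliding-window average of per-residue hydropathy values. The
sketch below uses the published Kyte-Doolittle scale; the window
size of 9 and the test peptides are illustrative assumptions, and
this is not the Pepplot implementation.

```python
# Kyte & Doolittle (1982) hydropathy values per amino acid.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9):
    """Mean hydropathy of every full window along the protein."""
    scores = [KD_SCALE[aa] for aa in protein.upper()]
    if window < 1 or window > len(scores):
        raise ValueError("window must be between 1 and the protein length")
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical peptide: a hydrophobic stretch followed by a charged one.
profile = hydropathy_profile("AAAAARRRRR", window=5)
print([round(value, 2) for value in profile])
```

Stretches where the averaged value stays high across all aligned
sequences are the kind of region labeled as a conserved
hydrophobic domain.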

Protein  Secondary  Structure  Analysis 

The amino acid sequence alignment (Fig. 3.4) showed that two
cysteines and nine prolines were conserved among the five NPVs.
The secondary structure analysis showed eight potential
α-helices, four loops and one β-sheet. Several conserved domains
were found inside these secondary structures. Most of the
conserved domains were found in the middle of the gp41 amino acid
sequence; the amino and carboxyl termini were highly variable. No
N-glycosylation sites, Nx(S/T), are presented in Fig. 3.4,
because the gp41 protein has been reported to be an O-linked
glycoprotein. However, no consensus O-glycosylation sites were
predicted from the aligned sequences. One potential transmembrane
segment, with a consensus sequence of ENX4APLSAX3IFX, was found
close to the carboxyl end using the transmembrane domain
prediction program.

Figure 3.3. Hydrophobicity profile of the gp41 protein among five
different NPVs. Conserved hydrophobic domains I-V were
arbitrarily assigned (see text for details).

Figure 3.4. Alignment of the amino acid sequences of the gp41
protein among five different NPVs. CONS represents the consensus
sequence. Dots indicate gaps, and dashes in the CONS sequence
indicate gaps or non-conserved positions. The conserved proline
sites are denoted by the * symbol and the conserved cysteine
sites by the @ symbol. Specific secondary structure domains are
labeled inside the boxes, and the transmembrane domain is
highlighted by double underlines.
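
The sequence patterns mentioned above can be written as regular
expressions. This sketch assumes that the digits in the consensus
ENX4APLSAX3IFX are repeat counts for X (any residue), since
subscripts are not preserved in the text, and it uses N followed
by any residue and then S or T as the N-glycosylation pattern.
The peptides are hypothetical examples, not gp41 sequences.

```python
import re

# N-glycosylation pattern N-x-(S/T); lookahead finds overlapping sites.
N_GLYC_SITE = re.compile(r"(?=N.[ST])")


def n_glyc_sites(protein):
    """1-based positions of the N in each N-x-(S/T) match."""
    return [m.start() + 1 for m in N_GLYC_SITE.finditer(protein.upper())]


def consensus_to_regex(consensus):
    """Convert a consensus such as 'ENX4APLSAX3IFX' to a regex string.

    X means any residue; a number after a letter is read as a repeat
    count (an assumption about the original notation).
    """
    parts = []
    for residue, count in re.findall(r"([A-Z])(\d*)", consensus):
        if residue == "X":
            parts.append("." if not count else ".{%s}" % count)
        else:
            parts.append(residue * int(count or 1))
    return "".join(parts)


tm_pattern = consensus_to_regex("ENX4APLSAX3IFX")
print(tm_pattern)

# Hypothetical peptide containing one instance of the consensus.
hit = re.search(tm_pattern, "MKENAAAAAPLSAGGGIFQ")
print(hit.start() if hit else None)
print(n_glyc_sites("MNGSANPT"))
```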

Genomic  Structure  Analysis 

The genomic structures of the gp41 gene flanking regions of
AgMNPV-2D were analyzed (Fig. 3.5). When the whole gp41 gene
regions of the five NPVs were aligned, they showed similar
genomic structures and transcriptional orientations (relative to
the transcription direction of the AcMNPV polyhedrin gene), with
the exception of HzSNPV. In general, the gp41 gene regions were
located at m.u. 45 to 52, but the gp41 gene region of HzSNPV was
located at m.u. 96.5 to 97.6. Also, the transcriptional direction
of all the ORFs of HzSNPV is opposite to that in the other NPVs.

Discussion 


In summary, the nucleotide sequence of the gp41 gene region of
AgMNPV-2D was determined. Several ORFs were identified, including
the vlf-1 gene, ORF 330, ORF 300, the gp41 gene, and ORF >667.
Among these ORFs, AgMNPV-2D shared 50 to 70% nucleotide sequence
identity and 60 to 80% amino acid sequence similarity with the
four other NPVs. However, the AgMNPV-2D ORF 300 did not have
homologs in the gp41 regions of all of the other NPVs: the gp41
gene regions of SfMNPV-2 and HzSNPV did not contain sequences
homologous to ORF 300. This result may be caused by a genomic
deletion. However, it was not shown whether a sequence homologous
to the AgMNPV ORF 300 is present in a different genomic region of
SfMNPV-2 or HzSNPV. Furthermore, sequences homologous to the
AgMNPV-2D ORF 300 were searched for using the BLAST program, and
no significant homologous sequence was found other than the
AcMNPV and BmMNPV ORF 312.

Figure 3.5. Genomic structure of the gp41 gene flanking regions
of AcMNPV, BmMNPV, AgMNPV-2D, SfMNPV-2, and HzSNPV. * refers to
the ORF which was not found in either SfMNPV or HzSNPV. Note that
the HzSNPV data are modified from isolate HzS-15, which is
considered a genomic rearrangement isolate (see text for
details).

The gp41 gene is unusual in that its product is found only in the
OV. However, no biological function has been demonstrated yet. An
attempt to select a recombinant virus with a deletion in the gp41
gene was not successful. This result suggests that gp41 could be
an essential gene that influences both BV and OV, even though the
gp41 protein is found only in the OV. If gp41 is an essential
gene, a transformed cell line that constitutively expresses the
gp41 protein will be needed to complement the gene when a gp41
deletion mutant is selected. In the meantime, several computer
programs were used to predict the potential biological function
based on the biochemical characteristics of the gp41 protein.

Four α-helices at consensus sites 93 to 104, 204 to 222, 244 to
257, and 273 to 284 (CvDyxkliRyY, EaakqLsiAvQYmvaeaV,
qQLaNnYxTLLLkr, and IndLINxVIDDl), one loop domain at 237 to 241
(PIPLP), and one β-sheet domain at 292 to 295 (YYxYV) were found
to be conserved (Fig. 3.4). The results were confirmed using both
the PHD (EMBL) and Darwin programs (Benner, 1995). One
transmembrane domain was predicted at amino acids 309 to 328. The
transmembrane domain (Fig. 3.4) was also found to be a conserved
hydrophobic domain (Fig. 3.3). These results strongly suggest
that the gp41 protein is a membrane protein. The hydrophobicity
profile revealed five conserved hydrophobic domains, and region
III was also found to be a conserved α-helix domain. The
coincidence of a conserved hydrophobic domain and an α-helix may
suggest that region III has a specific biological function. An
attempt to generate a three-dimensional (3D) structure using the
threading method (Madej et al., 1995) was not successful because
no sequence homologous to the gp41 protein was found in the PDB
(Protein Data Bank). Crystallographic data for gp41 or a closely
related transmembrane protein will be needed to generate a 3D
structure of the gp41 protein.

In contrast with the gp41 protein, the gp64 protein is found only
in the BV. The gp64 protein is a glycosylated membrane protein
that is involved in cell-to-cell infection, and its gene has been
shown to be essential for baculovirus infectivity. Two conserved
hydrophobic domains at amino acids 220 to 230 and 327 to 338
(TELVACLLIKD and LNNMMHDLIYSV) were associated with biological
function. Region I is involved in the fusion activity of the gp64
protein, and region II is involved in the oligomerization and
transport of the gp64 protein (Monsma & Blissard, 1995). Also,
one transmembrane domain was identified at the carboxyl terminus.
No amino acid sequence similarity was found between the gp64 and
gp41 transmembrane domains. A study of the similarities in
secondary structure between the gp41 and gp64 proteins will
provide information for understanding the baculovirus structural
proteins.

The AcMNPV vlf-1 gene encodes a very late expression factor that
regulates polyhedrin gene transcription (McLachlin & Miller,
1994), and is required for strong expression of the polyhedrin
gene, as shown with a characterized temperature-sensitive mutant.
The translated amino acid sequence showed homology with a family
of integrases, resolvases and RNA helicases (McLachlin & Miller,
1994), which may be involved in the interaction with DNA and/or
RNA during transcription. Unfortunately, the partial amino acid
sequences of SfMNPV and HzSNPV did not overlap with these
specific motifs, and no further analysis was done because of
insufficient information.

The phylogenetic analysis of the gp41 gene showed that AgMNPV-2D
had a closer relationship to AcMNPV and BmMNPV than to SfMNPV-2
and HzSNPV. This result is consistent with the DNA hybridization
data (Smith & Summers, 1982), in which AcMNPV was found to have
low homology with HzSNPV and SfMNPV (1% relative homology) but
moderate homology with AgMNPV-2D (8% relative homology). Not only
the DNA hybridization data, but also the phylogenetic tree of
baculovirus polyhedrin genes agrees with the phylogenetic tree of
the gp41 gene (Cowan et al., 1994; Zanotto et al., 1993). The
phylogenetic tree of the polyhedrin gene also divided AcMNPV,
AgMNPV-2D, and BmMNPV into one group and HzSNPV and SfMNPV-2 into
another group.

Overall, the genomic structure of the gp41 gene region showed
that all of the NPVs have similar local ORF arrangements except
HzSNPV. The HzS-15 isolate analyzed in the present study was
described as a rearranged genomic isolate, based on an overall
genomic structural comparison with another HzSNPV isolate, ELCAR
(Cowan et al., 1994). The gp41 gene of the HzS-15 isolate
terminates upstream of the polyhedrin gene, near m.u. 97, whereas
the gp41 gene of the HzSNPV ELCAR isolate is placed downstream of
the DNA polymerase-related ORF, near m.u. 50, which is far away
from the polyhedrin gene. This explains why HzS-15 has a
different genomic position and transcriptional orientation. The
ELCAR isolate was not used in place of HzS-15 because its
sequence in the gp41 gene region (specifically, the gp41 gene) is
incomplete. When the HzSNPV ELCAR isolate, instead of HzS-15, was
compared with the four other NPVs, the gp41 gene regions were
always located around m.u. 45 to m.u. 52. These data indicate
that most NPVs still maintain similar genomic structures even
though there is a mechanism for genomic DNA rearrangement.


CHAPTER  4 

PHYLOGENETIC  ANALYSIS  OF  BACULOVIRUSES 


Introduction 

The evolutionary relationships among baculoviruses have been
predicted using molecular approaches. Up to 1996, three
baculovirus genes, the polyhedrin (polh) gene (Rohrmann, 1986;
Zanotto et al., 1993; Cowan et al., 1994), the DNA polymerase
(dnapol) gene (Pellock et al., 1996) and the ecdysteroid
UDP-glucosyltransferase (egt) gene (Barrett et al., 1995), had
been used to reconstruct phylogenetic trees. The results based on
the polh gene of baculoviruses (Rohrmann, 1986; Zanotto et al.,
1993; Cowan et al., 1994) suggest that dipteran NPVs and
hymenopteran NPVs diverged from the lepidopteran NPVs and GVs
before the lepidopteran NPVs and GVs split from each other. The
phylogenetic tree of the baculovirus dnapol genes was
reconstructed using six baculoviruses, Autographa californica
MNPV (AcMNPV), Bombyx mori MNPV (BmMNPV), Orgyia pseudotsugata
MNPV (OpMNPV), Choristoneura fumiferana MNPV (CfMNPV),
Helicoverpa zea SNPV (HzSNPV) and Lymantria dispar MNPV (LdMNPV)
(Ahrens & Rohrmann, 1996), and is generally comparable to the
phylogenetic tree based on the polh gene. Furthermore, the dnapol
genes of two baculoviruses, AcMNPV and HzSNPV, were compared with
those of two other insect DNA viruses (Spodoptera ascovirus, SAV,
and Choristoneura biennis entomopoxvirus, CbEPV) (Pellock et al.,
1996), and with those of human viruses, to reveal their
evolutionary relationships. The results suggest that the
baculoviruses have an evolutionary pathway independent of other
insect and human viruses. Phylogenetic analysis of the third
baculovirus gene (egt) among six different baculoviruses shows a
topology similar to that of the phylogenetic trees of the polh
and dnapol genes (Barrett et al., 1995).

Although the molecular approach can be used to elucidate the
evolutionary relationships among baculoviruses, critics agree
that the phylogenetic tree of a particular gene does not
represent the evolutionary pathway of the whole organism (Li &
Graur, 1991). So far, all baculovirus phylogenetic trees are
based on a single gene, and therefore may not properly represent
the evolutionary pathway of baculoviruses. In the present study,
this problem is approached using a congruence analysis (Miyamoto,
1985; Wheeler, 1991). The evolutionary relationship of
baculoviruses is inferred from multiple phylogenetic trees of
baculovirus genes instead of a single gene. The congruent results
are drawn from six different phylogenetic trees of baculovirus
genes encoding either structural proteins (polh, p10, gp64, and
gp41) or enzymatic proteins (dnapol and egt). The results will
provide more solid support for a current hypothesis of the
baculovirus evolutionary pathway.

Methods 

DNA  Purification  of  LdMNPV 

Lymantria dispar MNPV (LdMNPV) DNA was purified (Appendix D)
from a commercial preparation of polyhedra (GYPCHEK, U.S.
Forest Service) and used as the DNA template for PCR
amplification.

PCR Amplification and DNA Sequencing of LdMNPV gp41 Gene

A set of polymerase chain reaction (PCR) primers was
constructed to amplify the gp41 gene of LdMNPV. The
oligonucleotide primers were designed based upon the
conserved sequences of gp41 genes from five baculoviruses
including AcMNPV (Kool et al., 1994), Anticarsia gemmatalis
MNPV (AgMNPV) (Liu & Maruniak, unpublished data), BmMNPV
(Nagamine et al., 1991), HzSNPV (Ma et al., 1992), and
SfMNPV (Liu & Maruniak, 1995).

The JM37 upstream primer of the gp41 gene was a 25
nucleotide oligomer with the following sequence:
ACAA(C/T)AA(C/T)TATATTATAAGTA(A/G)TCC. This primer was
located within the transcriptional initiation site region of
the gp41 gene. The JM40 downstream primer was a 21
nucleotide oligomer with the following sequence:
GTTGTAAAA(C/T)TTTTGNGC(G/A)TA. Based on DNA sequence
alignment, the expected size of the PCR product using this
primer set was around 500 base pairs (bp).
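
The primer notation above can be checked mechanically. The
following is a minimal sketch, not part of the original
protocol, that counts the positions in each degenerate primer
and the number of distinct oligonucleotides it represents. The
function name primer_stats and the lookup table are
illustrative assumptions; N is taken to mean any of the four
bases.

```python
import re

# Number of bases represented by each single-letter code used in the primers.
# Treating N as "any of four bases" is the usual convention (assumption here).
IUPAC_CHOICES = {"A": 1, "C": 1, "G": 1, "T": 1, "N": 4}


def primer_stats(primer: str) -> tuple[int, int]:
    """Return (length, degeneracy) for a primer written like ACAA(C/T)AA..."""
    length = 0
    degeneracy = 1
    cleaned = primer.replace(" ", "").upper()
    # A token is either a parenthesised choice such as (C/T) or a single letter.
    for token in re.findall(r"\([ACGT/]+\)|[ACGTN]", cleaned):
        length += 1
        if token.startswith("("):
            degeneracy *= len(token.strip("()").split("/"))
        else:
            degeneracy *= IUPAC_CHOICES[token]
    return length, degeneracy


JM37 = "ACAA(C/T)AA(C/T)TATATTATAAGTA(A/G)TCC"
JM40 = "GTTGTAAAA(C/T)TTTTGNGC(G/A)TA"

print(primer_stats(JM37))  # (25, 8)
print(primer_stats(JM40))  # (21, 16)
```

The lengths agree with the 25 and 21 nucleotides stated in the
text; the degeneracy values are a by-product of the notation
and are not reported in the original.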

The PCR reaction was done in a final volume of 25 µl
containing 200 µM of each dNTP, 4 pmoles of each primer,
2 mM MgCl2, 0.5 units of Primezyme (Biometra), and reaction
buffer (10 mM Tris-HCl, pH 8.8, 50 mM KCl, 0.1% Triton
X-100). One hundred ng of DNA template (LdMNPV genomic DNA)
was used per PCR reaction. Thirty µl of autoclaved mineral
oil was applied to the top of the reaction mixture to
prevent evaporation. The PCR reaction was performed in a
PTC-100 programmable Thermal Cycler (MJ Research, Inc). The
PCR program consisted of an initial denaturation step at
95°C for 1 min, followed by 35 cycles at 94°C for 1 min
(denaturation), 45°C for 1.5 min (annealing), and 72°C for
2 min (extension). The final extension step lasted 15 min.
The PCR product was purified through a DNA purification
column (QIAquick™, Qiagen Inc.) to remove salts and enzyme.
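
As a quick arithmetic check of the reaction conditions above,
the sketch below converts the primer amount into a
concentration and adds up the programmed hold times. The
function names are illustrative assumptions, and the time
estimate ignores temperature ramping, which the text does not
describe.

```python
def primer_concentration_um(pmol: float, volume_ul: float) -> float:
    """Convert a primer amount to a concentration (pmol per µl equals µM)."""
    return pmol / volume_ul


def program_minutes(initial: float, cycles: int, steps: tuple, final: float) -> float:
    """Sum the hold times of a cycling program; ramp times are not included."""
    return initial + cycles * sum(steps) + final


# Values taken from the protocol: 4 pmol of primer in 25 µl; 1 min initial
# denaturation, 35 cycles of 1 + 1.5 + 2 min, and a 15 min final extension.
print(primer_concentration_um(4, 25))
print(program_minutes(1, 35, (1, 1.5, 2), 15))
```

Under these assumptions each primer is present at 0.16 µM and
the programmed holds add up to 173.5 min.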

The purified PCR product was then cloned into a pGEM-T
vector (Promega Corp.), and sequenced using an automatic
sequencer (ABI 373a) at the DNA Sequencing Core Laboratory
(DSEQ) of the Interdisciplinary Center for Biotechnology
Research (ICBR) at the University of Florida.

Search  of  Baculovirus  Genes  through  GenBank 

The BLAST (Madden et al., 1996) and ENTREZ (Schuler et
al., 1996) programs (Appendix B) available from the National
Center for Biotechnology Information (NCBI, USA) were used
to search for homologous sequences of six baculovirus genes.
A list containing the GenBank accession numbers, baculovirus
species names and related references used in the present
work is presented in Table 4.1.

Twenty-three nucleotide sequences of baculovirus polh
genes were found in GenBank. Three undeposited polh
gene sequences, Anagrapha falcifera MNPV (Dr. Federici,
personal communication), A. gemmatalis MNPV and Neodiprion
sertifer SNPV (Zanotto et al., 1993), were entered manually
into a MicroVAX computer at the Biological Computing
Facility (BCF) of the ICBR at the University of Florida.


Table 4.1. List of GenBank accession numbers, baculovirus
species and references for DNA sequences used in the
construction of baculovirus phylogenetic trees.

Accession     Baculovirus species                           Reference
number

Polyhedrin gene
D00437        Panolis flammea MNPV (PfMNPV)                 Oakey et al., 1989. J. Gen. Virol. 70:769
D01017        Spodoptera littoralis MNPV (SpliMNPV)         Croizer & Croizer, 1994. unpublished
D14573        Hyphantria cunea MNPV (HcMNPV)                Isayama et al., 1993. unpublished
J04333        Spodoptera frugiperda MNPV (SfMNPV)           Gonzalez et al., 1989. Virology 170:160
K01149        Autographa californica MNPV (AcMNPV)          Hooft van Iddekinge et al., [year illegible]. Virology 131:561
[illegible]   Orgyia pseudotsugata MNPV (OpMNPV)            Leisy et al., 1986b. Virology 153:280
[illegible]   Mamestra brassicae MNPV (MbMNPV)              Cameron & Possee, 1989. Virus Res. 125:183
M23176        Lymantria dispar MNPV (LdMNPV)                Smith et al., 1988. Gene 71:97
[illegible]   Bombyx mori MNPV (BmMNPV)                     Maeda et al., 1985. Nature 315:529
M32433        Orgyia pseudotsugata SNPV (OpSNPV)            Leisy et al., 1986a. J. Gen. Virol. 67:1073
S48199        Spodoptera exigua MNPV (SeMNPV)               van Strien et al., 1992. J. Gen. Virol. 73:2813
[illegible]   Attacus ricini NPV (ArMNPV)                   Hu et al., 1993. I Chuan Hsueh Pao 20:300
U22824        Perina nuda MNPV (PnMNPV)                     Chou et al., 1993. unpublished
U30302        Leucania separata MNPV (LsMNPV)               Wang et al., 1996. unpublished
U40833        Choristoneura fumiferana MNPV (CfMNPV)        Rieth et al., 1996. unpublished
U40834        Archips cerasivoranus MNPV (ArcMNPV)          Rieth et al., 1996. unpublished
X55658        Malacosoma neustria MNPV (MnMNPV)             Vladimir & Kavasan, 1990. unpublished
X70844        Buzura suppressaria MNPV (BsMNPV)             Hu et al., 1993. J. Gen. Virol. 74:1617
X94437        Spodoptera litura MNPV (SlMNPV)               Bansal et al., 1996. unpublished
Z12117        Helicoverpa zea SNPV (HzSNPV)                 Cowan et al., 1994. J. Gen. Virol. 75:3211
K02910        Trichoplusia ni GV (TnGV)                     Akiyoshi et al., 1985. Virology 141:328
[illegible]   Pieris brassicae GV                           Chakerian et al., 1985. J. Gen. Virol. 66:1263
X79569        Cryptophlebia leucotreta GV (ClGV)            Jehle & Backhaus, 1994. J. Gen. Virol. 75:3667

p10 gene
M10023        Autographa californica MNPV                   Kuzio et al., 1984. Virology 139:414
[illegible]   Orgyia pseudotsugata MNPV                     Leisy et al., 1986c. Virology 153:157
[illegible]   Choristoneura fumiferana MNPV                 Wilson et al., 1995. J. Gen. Virol. 76:2923
[illegible]   Bombyx mori MNPV                              Palhan & Gopinathan, 1995. thesis, Indian Inst. Sci., India
U50411        Perina nuda MNPV                              Chou et al., 1996. unpublished
X69615        Spodoptera exigua NPV                         Zuidema et al., 1993. J. Gen. Virol. 74:1017
X92713        Spodoptera litura NPV                         Behera et al., 1996. unpublished

gp41 gene
D14468        Bombyx mori MNPV                              Nagamine et al., 1991. J. Invertebr. Pathol. 58:290
L04748        Helicoverpa zea SNPV                          Ma et al., 1992. Virology 192:224
U14725        Spodoptera frugiperda MNPV                    Liu & Maruniak, 1995. J. Gen. Virol. 76:1443
U37728        Anticarsia gemmatalis MNPV (AgMNPV)           Liu & Maruniak, 1996. unpublished
X71415        Autographa californica MNPV                   Kool et al., 1994. J. Gen. Virol. 75:487

gp64 gene
L12412        Choristoneura fumiferana MNPV                 Hill & Faulkner, 1994. J. Gen. Virol. 75:1811
L33180        Bombyx mori MNPV                              Maeda, 1994. unpublished
M22446        Orgyia pseudotsugata MNPV                     Blissard & Rohrmann, 1989. Virology 170:537
M25420        Autographa californica MNPV                   Whitford et al., 1989. J. Virol. 63:1393
X00410        Galleria mellonella MNPV (GmMNPV)             [authors and year illegible]. FEBS Lett. 167:254

DNA polymerase gene
D11476        Lymantria dispar MNPV                         [authors and year illegible]. J. Gen. Virol. 73:3177
D16231        Bombyx mori MNPV                              Chaeychomsri et al., 1995. Virology 206:436
M20744        Autographa californica MNPV                   Tomalski et al., 1988. Virology 167:591
U11242        Helicoverpa zea SNPV                          Cowan et al., 1994. J. Gen. Virol. 75:3211
U18677        Choristoneura fumiferana MNPV                 Liu & Carstens, 1995. Virology 209:538
U39145        Orgyia pseudotsugata MNPV                     Gross et al., 1993. J. Virol. 67:469
U35732        Spodoptera ascovirus                          Pellock et al., 1996. Virology 216:146
X57314        Choristoneura biennis entomopoxvirus          Mustafa & Yuen, 1991. DNA Sequence 2:39

egt gene
D17353        Orgyia pseudotsugata MNPV                     Rohrmann, 1994. unpublished
L33180        Bombyx mori MNPV                              Maeda, 1994. unpublished
M22619        Autographa californica MNPV                   Miller, 1989. unpublished
U04321        Lymantria dispar MNPV                         Riegel et al., 1994. J. Gen. Virol. 75:829
U10441        Choristoneura fumiferana MNPV                 Barrett et al., 1995. J. Gen. Virol. 76:2447
U41999        Mamestra brassicae MNPV                       Clarke et al., 1996. J. Gen. Virol., in press
X84701        Spodoptera littoralis MNPV (SlittMNPV)        Faktor et al., 1995. Virus Genes 11:47
Y08294        Lacanobia oleracea GV (LoGV)                  Smith & Goodale, 1996. unpublished

In addition to polh, the nucleotide sequences of p10,
gp41, gp64, dnapol and egt genes were searched. Seven
baculovirus p10 genes and five gp64 genes were found. For
the gp41 gene, five complete nucleotide sequences were
obtained from GenBank and two partial sequences, of LdMNPV
and Xestia c-nigrum granulovirus (XcGV) (Dr. Goto, personal
communication), were included for further analysis. Eight
egt genes from different baculoviruses were also included
in this study. Finally, dnapol genes of six
baculoviruses and two other insect viruses (Spodoptera
ascovirus, SAV, and Choristoneura biennis entomopoxvirus,
CbEPV) were included for the phylogenetic studies.

Reconstruction of Phylogenetic Trees of Baculovirus Genes

The nucleotide sequences obtained from the BLAST and ENTREZ
programs were analyzed using the Wisconsin Sequence Analysis
Package™ (Version 8.1 VMS for VAX computer; Genetics
Computer Group). Amino acid sequences were translated from
the nucleotide sequences. Multiple sequence alignments of
both nucleotide and amino acid sequences were first produced
using the Pileup program. The aligned sequences were then
realigned using the CLUSTAL program (Higgins et al.,
1996) because of its accuracy in comparing sequences with
low homology.

MEGA (Kumar et al., 1993) and PAUP (Swofford, 1990)
computer programs were used to reconstruct the phylogenetic
trees based on the final aligned sequences that were produced
by CLUSTAL. The p-distance (proportion distance) and
maximum parsimony methods (Fitch, 1971) were used for
reconstructing phylogenetic trees based on nucleotide
sequence data. The p-distance and neighbor-joining methods
(Saitou & Nei, 1987) were chosen for reconstructing
phylogenetic trees based on the amino acid sequence data.
The bootstrap test with 500 replications was done to show
the reliability of the trees constructed using the
neighbor-joining method. The bootstrap result was given as a
percentage confidence level.
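
To make the two quantities named above concrete, the
following is a minimal sketch of a p-distance calculation and
of bootstrap resampling of alignment columns. It is not the
MEGA or PAUP implementation and it does not build a
neighbor-joining tree; it only asks in what percentage of
replicates one sequence remains closer to a reference than
another. The toy alignment and the names p_distance and
bootstrap_support are hypothetical.

```python
import random


def p_distance(a: str, b: str) -> float:
    """Proportion of aligned sites that differ; gapped sites are skipped."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def bootstrap_support(aln: dict, ref: str, near: str, far: str,
                      replicates: int = 500, seed: int = 0) -> float:
    """Percentage of replicates in which d(ref, near) < d(ref, far)."""
    rng = random.Random(seed)
    n = len(aln[ref])
    hits = 0
    for _ in range(replicates):
        # Resample alignment columns with replacement.
        cols = [rng.randrange(n) for _ in range(n)]
        sample = {name: "".join(seq[i] for i in cols)
                  for name, seq in aln.items()}
        if p_distance(sample[ref], sample[near]) < p_distance(sample[ref], sample[far]):
            hits += 1
    return 100.0 * hits / replicates


# Hypothetical three-sequence alignment (not data from this study).
toy = {"A": "ATGCATGCAT", "B": "ATGCATGCAA", "C": "TTGGATCCAA"}
print(p_distance(toy["A"], toy["B"]))  # 0.1
print(p_distance(toy["A"], toy["C"]))  # 0.4
print(bootstrap_support(toy, "A", "B", "C"))
```

In a full analysis the resampled alignments would each be
turned into a tree, and the percentage would be counted per
branch rather than per pairwise comparison.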

Relationship of Baculoviruses with Insect Hosts

The insect host families (Hodges et al., 1983) are
presented in Figure 4.1. The family name in parentheses
after each baculovirus species name is that of the insect
host of the baculovirus used in this study to reconstruct
the polh gene phylogenetic tree. The correlation between the
baculoviruses and their insect hosts was studied by
determining whether or not the insect hosts of closely
related baculoviruses belong to the same family.

Results 

PCR  Amplification  and  DNA  Sequencing  of  LdMNPV  gp41  Gene 

A partial sequence of the LdMNPV gp41 gene (381 bp) was
amplified and sequenced (Appendix E). A baculovirus late
gene motif was found upstream from the ATG translation start
site (-32 to -28). The context of the translation start site
did not fit the Kozak rule completely, but it was very
similar: AxxATGC was found instead of the consensus sequence
AxxATG(A/G).
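
The comparison above can be expressed as a simple pattern
test. This is a minimal sketch that uses the consensus exactly
as it is written in the text, AxxATG(A/G), with x standing for
any base; the function name and the example strings are
hypothetical and are not taken from the LdMNPV sequence.

```python
import re

# Consensus as stated in the text: A, any two bases, ATG, then A or G.
STATED_CONSENSUS = re.compile(r"A[ACGT]{2}ATG[AG]")


def fits_stated_consensus(context: str) -> bool:
    """True if a 7-nucleotide start-site context matches AxxATG(A/G)."""
    return STATED_CONSENSUS.fullmatch(context.upper()) is not None


# Hypothetical contexts: the first ends in C (the AxxATGC form reported
# for LdMNPV gp41), the second ends in G.
print(fits_stated_consensus("ACAATGC"))  # False
print(fits_stated_consensus("ACAATGG"))  # True
```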

The partial LdMNPV gp41 coding sequence was compared
with the AcMNPV gp41 coding sequence, and it showed 56%
nucleotide sequence identity and 76% amino acid sequence
similarity. The partial LdMNPV gp41 nucleotide sequence and
translated amino acid sequence were used to reconstruct the
phylogenetic tree.

Phylogenetic  Tree  of  Baculovirus  polh  Genes 

Twenty-six amino acid sequences from different
baculoviruses were obtained from translated nucleotide
sequences of published data (Table 4.1), and used to
reconstruct the phylogenetic tree. In Figure 4.1, the
phylogenetic tree was divided into three main branches: the
lepidopteran NPVs, the lepidopteran GVs and the hymenopteran
NPV (NsSNPV). Within the lepidopteran NPV branch, the tree
was divided into two main groups and one outgroup branch.
Group I included AcMNPV, BmMNPV, AfMNPV, ArMNPV, AgMNPV,
HcMNPV, ArcMNPV, OpMNPV, PnMNPV, and CfMNPV. Group II
included OpSNPV, BsSNPV, PfMNPV, LsMNPV, MbMNPV, SlMNPV,
SeMNPV, SfMNPV, MnMNPV, HzSNPV and SpliMNPV. The only
member of the outgroup branch was LdMNPV (for complete names
of baculoviruses, see Table 4.1). The tree reliability test
using bootstrap analysis showed a low confidence level of
53% (Fig. 4.1) in the branch that divides groups I and II.

Maximum parsimony analysis of the nucleotide sequences
of 25 baculovirus polh genes (Fig. 4.2) showed two main
branches of lepidopteran NPVs, and agreed with the grouping
profile from the phylogenetic tree based on the amino acid
sequences (Fig. 4.1). Group II was divided into two
subgroups. Subgroup A included LsMNPV, MbMNPV, PfMNPV,
SfMNPV, SlMNPV and SeMNPV, and subgroup B included HzSNPV,
MnMNPV, BsSNPV and OpSNPV.

The distance between lepidopteran GVs and lepidopteran NPVs
was calculated to be 0.3 to 0.4, and that between
lepidopteran GVs and NsSNPV to be 0.56 (Fig. 4.1).

Phylogenetic Trees of p10, gp41 and gp64 Genes

The phylogenetic trees of the baculovirus genes coding for
the structural proteins p10, gp41 and gp64 are presented
in Figure 4.3 (based on the amino acid sequences) and Figure
4.4 (based on the nucleotide sequences). The p10 gene
phylogenetic tree based on the amino acid sequences showed
that AcMNPV and BmMNPV were in the same group. OpMNPV and
PnMNPV were also closely related. The results also showed
that SlMNPV was distantly related to the other NPVs that
were analyzed. When the p10 gene phylogenetic tree based
on the nucleotide sequences (Fig. 4.4 A) was compared with
the tree based on the amino acid sequences (Fig. 4.3 A), the
results showed some differences. OpMNPV and PnMNPV were
distantly related to AcMNPV and BmMNPV based on the
nucleotide sequences, whereas SeMNPV and SlMNPV were
distantly related in the amino acid based tree.

The gp41 gene phylogenetic tree based on the amino acid
sequences (Fig. 4.3 B) grouped AcMNPV, BmMNPV and AgMNPV
together, and LdMNPV and HzSNPV in a separate group.
SfMNPV was distantly related to these two groups. The
results also positioned XcGV as an outgroup. The gp41
gene phylogenetic tree based on the nucleotide sequences
(Fig. 4.4 B) agreed with the tree based on the amino acid
sequences. The gp64 gene phylogenetic tree based on the
amino acid sequences (Fig. 4.3 C) placed AcMNPV, BmMNPV and
GmMNPV in one group, and OpMNPV and CfMNPV in a second
group. However, the phylogenetic tree based on the
nucleotide sequences (Fig. 4.4 C) showed that BmMNPV was
closer to OpMNPV and CfMNPV than to AcMNPV and GmMNPV.

Phylogenetic Trees of dnapol and egt Genes

The dnapol gene phylogenetic tree based on the amino
acid sequences from six baculoviruses (Ahrens & Rohrmann,
1996), one ascovirus and one entomopoxvirus (Pellock et al.,
1996) was reconstructed and showed that AcMNPV, BmMNPV,
CfMNPV and OpMNPV were closely related, while HzSNPV and
LdMNPV were grouped separately (Fig. 4.5 A). The results
also indicated that SAV and CbEPV were distantly related to
baculoviruses. The phylogenetic tree obtained from the
nucleotide sequence data (Fig. 4.6 A) confirmed these
results.

The egt gene phylogenetic tree (Barrett et al., 1995)
showed that AcMNPV and BmMNPV grouped together, while CfMNPV
and OpMNPV formed another group, and LdMNPV and MbMNPV a
third group. S. littoralis MNPV (SpliMNPV; abbreviated to
distinguish it from S. litura MNPV, which is abbreviated
SlMNPV) was distantly related to the other lepidopteran
NPVs. LoGV was considered to be an outgroup virus in this
analysis. The phylogenetic trees based on the amino acid
and nucleotide sequences agreed with each other (Fig. 4.5 B
and 4.6 B).

Relationship of Baculoviruses and Their Hosts

The family names of the insect hosts of baculoviruses
are also shown in Fig. 4.1. When hosts were compared with
the polh gene phylogenetic tree, the results showed a
certain level of correlation between hosts and
baculoviruses. For example, the hosts of OpMNPV, CfMNPV and
PnMNPV, which were closely related, belonged to the family
Lymantriidae. Also, most NPVs from group II, including
MbMNPV, PfMNPV, SfMNPV, SeMNPV, SlMNPV, and HzSNPV, infected
hosts from the family Noctuidae.

Congruent  Analysis  of  Baculovirus  Genes 

A congruent analysis based on combined baculovirus gene
data sets was compared with six independent phylogenetic
trees of baculovirus genes. The six genes were polh,
dnapol, egt, p10, gp41 and gp64 of AcMNPV, BmMNPV, OpMNPV
and PfMNPV. The phylogenetic tree of the combined sequence
data was reconstructed and compared to each single gene
tree. The results did not show any difference between the
universal tree that was based on the combined sequence data
and each single gene tree.
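
One simple way to state what "no difference" between two
trees means is that both contain the same set of clades. The
sketch below, which is not the procedure used by the programs
in this study, compares rooted trees written as nested tuples.
The example topologies are hypothetical and serve only to
illustrate the comparison for the four species named above.

```python
def clades(tree):
    """Return (leaf_set, set_of_clades) for a rooted nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)  # the group of all leaves below this node
    return leaves, found


def same_topology(tree_a, tree_b) -> bool:
    """True if two rooted trees define exactly the same clades."""
    return clades(tree_a)[1] == clades(tree_b)[1]


# Hypothetical topologies, for illustration only.
combined = ((("AcMNPV", "BmMNPV"), "OpMNPV"), "PfMNPV")
gene_1 = ("PfMNPV", ("OpMNPV", ("BmMNPV", "AcMNPV")))
gene_2 = ((("AcMNPV", "OpMNPV"), "BmMNPV"), "PfMNPV")

print(same_topology(combined, gene_1))  # True
print(same_topology(combined, gene_2))  # False
```

The order in which branches are written does not matter in
this comparison, only which species are grouped together.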

Discussion

This study presents for the first time an analysis of
the evolutionary relationships among baculoviruses using
phylogenetic trees based on multiple baculovirus genes.
A congruent analysis was made in order to alleviate problems
in previous evolutionary studies that were based on a single
baculovirus gene (Rohrmann, 1986; Zanotto et al., 1993). It
has been a challenge to determine whether or not a gene
tree can represent the species tree in evolutionary studies
(Li & Graur, 1991). A congruent analysis was used in an
attempt to solve this problem. Although only six gene
sequences from four baculovirus species were available for
comparison in this study, more gene sequence data will
become available for congruent analyses of baculoviruses in
the future. Currently, there are no guidelines for how many
genes and species need to be tested to support a
phylogenetic tree based on congruent analysis. The
development of PCR and automatic DNA sequencing techniques
is rapidly increasing the number of available sequences and
will help improve data analysis.

The congruent approach also allowed us to determine whether
an individual gene tree agrees with the other gene trees,
including the universal tree. The results suggested that
the polh gene is a useful marker to represent the universal
tree and/or the baculovirus species tree. In addition, the
phylogenetic tree of the polh gene can be used to identify a
newly isolated baculovirus. There are two reasons for
using the polh gene tree to represent the baculovirus
species tree in this study. First, the polh gene has
the largest set of sequences currently available,
including nucleotide sequences from 25
baculovirus species and amino acid sequences from 26
baculovirus species. Second, the polh gene tree agrees with
all other gene trees that have been tested in this study.
No significant difference was found between the polh gene
tree and the other gene trees. The polh gene data also
agree with the universal tree, which in total represents
around 6% of the genomic DNA (based on AcMNPV) and 4% of the
total potential encoded genes. Compared with the other
available baculovirus gene sequences, the polh gene is
therefore considered the most reliable and useful gene to
represent the phylogenetic tree for baculovirus species.

In comparison to the data published by Zanotto et al.
(1993), the polh gene phylogenetic tree in this study was
reconstructed with newly available sequences. The results
showed that there were three main branches: lepidopteran
NPVs, lepidopteran GVs and a hymenopteran NPV.
They also indicated that lepidopteran NPVs can be divided
into two groups, I and II. Lepidopteran group II can be
further subdivided into several subgroups, as Cowan et al.
(1994) suggested. The divergence of lepidopteran NPV
subgroups may be indicative of an ongoing evolutionary
pathway for baculoviruses. More careful examination of the
evolutionary rate, such as nucleotide substitutions per
nucleotide site (Aotsuka et al., 1994), is needed to
determine if subgroups I and II will become well-separated
branches.

Overall, there is a 59% nucleotide sequence identity of
the polh gene between lepidopteran NPVs and lepidopteran
GVs, and there are 74% to 92% identities among lepidopteran
NPVs (Rohrmann, 1992). In addition, the baculovirus polh
gene has a functional counterpart in the cytoplasmic
polyhedrosis virus (CPV) family of RNA viruses. In 1989,
Fossies et al. characterized the gene that encodes the major
protein present in the proteinaceous occlusion body
(polyhedrin gene) of Euxoa scandens CPV. The identities of
the polh amino acid sequences between OpNPVs (OpMNPV and
OpSNPV) and OpCPV are as low as 12% (Galinski et al.,
1994). Although the sequence identities between
NPVs and CPVs are very low, their polh protein functions are
very similar. The dissimilarities between NPVs and CPVs,
such as low identities of polh gene sequences and different
types of genome structure (DNA viruses vs. RNA viruses),
indicate that their polh genes may have arisen by convergent
evolution.

The dnapol gene is used for comparison between
baculoviruses and other insect DNA viruses, because it is
the most common gene among DNA viruses from different
organisms. It has been reported that the AcMNPV dnapol gene
is classified in the viral subgroup of dnapol gene family B,
and is related to the dnapol genes of human viruses and of
eukaryotic organisms such as fungi (Heringa & Argos, 1994).
In this study, the results showed that the baculovirus group
had evolutionary paths independent of the other enveloped
insect DNA viruses, SAV and CbEPV. This agrees with
previously published results (Pellock et al., 1996) and is
not surprising, because these DNA viruses have different
genomic DNA replication and viral infection strategies
(Tanada & Kaya, 1993).

A previously published phylogenetic tree of the
baculovirus egt gene (Barrett et al., 1995) was
reconstructed and compared to the phylogenetic trees of the
polh and dnapol genes to examine the true topology of
baculovirus phylogenetic trees. No significant difference
was found between the egt gene tree and the other gene
trees. Although it is very common to find that different
gene trees have different topologies (Forterre, 1997), the
comparison between the polh, dnapol and egt gene trees
showed that these genes have similar evolutionary paths
and/or rates. Since all the analyzed trees agreed with each
other, it should be reasonable to predict the evolutionary
pathway based on a single gene tree such as that of the polh
gene.

Furthermore, three phylogenetic trees based on
baculovirus p10, gp41 and gp64 were reconstructed in this
study. For the phylogenetic trees of the baculovirus p10
gene, the results showed different schemes depending on
whether the nucleotide sequences or the amino acid sequences
were used. The inconsistencies may be caused by the low
homology of p10 genes among baculoviruses (20 to 40%
identities of amino acid sequences). Kumar et al. (1993)
found that it is easy to misinterpret the results when low
homology sequences are used to construct phylogenetic trees.
However, the inconsistencies could also be caused by using
different methods when the phylogenetic trees were
reconstructed. In this study, the nucleotide sequences were
analyzed using the maximum parsimony method, while the amino
acid sequences were analyzed using the neighbor-joining
method. These two methods have completely different
algorithms, which may explain why the p10 gene trees based
on different types of data did not agree with each other.
Since there is no evidence to suggest that one method is
superior to the other, it is probable that the p10 gene
group has a higher evolutionary rate (more nucleotide
substitutions per site) than the other gene groups.

The phylogenetic trees of the baculovirus gp41 and gp64
genes show similar topologies based on either nucleotide or
amino acid sequences. In this study, two partial sequences
were used to reconstruct the gp41 gene phylogenetic tree.
Partial sequences coding for highly conserved domains have
been used for reconstructing a dnapol gene phylogenetic tree
(Heringa & Argos, 1994) and gave a reliable result. In
general, the phylogenetic tree of baculovirus gp41 genes
agrees with the other gene trees. However, it is necessary
to note that incomplete sequences may sometimes result in a
dissimilarity with other trees.

Only five sequences were used to reconstruct the
phylogenetic tree of gp64 genes. Two functional domains of
gp64 proteins have been identified (Monsma & Blissard,
1995). It will be helpful to compare these specific domains
in a protein function study, and to reconstruct the
phylogenetic tree using the comparison of functional
domains. Moreover, a glycoprotein of Thogoto virus (a
tick-borne orthomyxo-like virus) shows homology with the
gp64 gene of baculoviruses (Morse et al., 1992). Even though
the amino acid sequence identity between gp64 and the
Thogoto virus glycoprotein is low (28-33%), similarities
between their hydrophobicity profiles and the conserved
cysteine sites are highly significant. Again, this indicates
that protein functional domains are highly conserved during
evolution.

In order to understand the host influence on
baculovirus evolution, the phylogenetic relationships of
baculoviruses were compared with their hosts. The results
showed that several branches of baculoviruses have the same
host family. Most of the baculoviruses in lepidopteran NPV
subgroup II appear to infect the insect family Noctuidae.
Two closely related lepidopteran NPV subgroup II branches,
the branch of LsMNPV, MbMNPV, and PfMNPV, and the
branch of SfMNPV, SlMNPV and SeMNPV, infect the
same host family (Noctuidae) with closely related
subfamilies (Hadeninae and Amphipyrinae). Based on these
results, it can be suggested that the lepidopteran NPV
subgroup II has undergone a host-dependent evolution. On
the other hand, the lepidopteran NPV subgroup I was more
diverse than subgroup II. Some baculovirus species, such as
the branch of OpMNPV and PnMNPV and the branch of ArcMNPV
and CfMNPV, infect closely related families of insect hosts.
These two closely related branches infect insects from the
families Lymantriidae and Tortricidae (Groner,
1986; Zanotto et al., 1993), respectively, which are closely
related. This also shows that a host-dependent evolutionary
pathway could exist. However, the rest of the lepidopteran
NPV subgroup I species did not have strong associations with
the same family of insect hosts. This implies that species
such as AcMNPV (host family, Noctuidae), BmMNPV (host
family, Bombycidae), ArMNPV (host family, Saturniidae) and
HcMNPV (host family, Arctiidae) are host-independent and
have diverged in a nonparallel way from their hosts. The way
that agricultural systems were involved in distributing the
baculoviruses may indirectly result in evolutionary changes
of baculoviruses.

Some association between viruses and their geographic
distribution has been reported (Fenner & Kerr, 1994; Zanotto
et al., 1993; 1995). The genetic distance among tick-borne
encephalitis viruses was found to be correlated with
geographic distance (Zanotto et al., 1995). However, no significant
evidence was found in this study to support such a
correlation for baculoviruses. Most baculoviruses that were
analyzed in this study are distributed all over the world,
including North America, South America, Europe, the Middle East
and Asia. Thus, the geographic distribution of
baculoviruses does not appear to be associated with their
genetic distances. Although a geographic correlation with
genetic distance was found among GVs in Southeast Asia and
AgMNPV in South America (Zanotto et al., 1993), this
correlation applied only to strains of the same
viral species rather than to different viral species.
Although no correlation between geographic distance and
genetic distance was found among baculoviruses, their
relationship cannot be ignored, since
it is necessary for a complete evolutionary study. In
addition, the evolutionary pathway of baculoviruses may be
related to other factors such as the feeding preferences of
their insect hosts. Certain closely related
baculoviruses were found to infect specific types of insect
hosts such as forest pests, crop pests and vegetable pests.
More studies will be needed to ascertain whether or not this
factor plays any role in baculovirus evolutionary
paths.
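
The question discussed above, whether genetic distance tracks geographic distance, can be illustrated with a simple correlation coefficient. The following Python sketch is not the analysis used in this study. The function name pearson_r and all distance values are hypothetical and only show the calculation. Because pairwise distances are not independent observations, a real analysis would judge significance with a permutation approach such as a Mantel test rather than from r alone.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant list")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical pairwise values for five virus pairs (illustration only).
geographic_km = [500.0, 1500.0, 3000.0, 8000.0, 12000.0]
genetic_distance = [0.12, 0.30, 0.08, 0.25, 0.15]

r = pearson_r(geographic_km, genetic_distance)
print(f"r = {r:.3f}")
```

A value of r near zero would be in line with the absence of a geographic signal, while a value near 1 would indicate that genetic distance increases with geographic distance.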

In conclusion, the congruence analysis done in this
study validates the evolutionary hypothesis of baculoviruses
suggested by Rohrmann (1986, 1992) and Zanotto et al.
(1993). The results confirm that hymenopteran NPVs diverged
early from lepidopteran NPVs and GVs, and that the
lepidopteran NPVs and GVs then split. Lepidopteran NPVs
continued to evolve into two subgroups, I and II, and
subgroup II diverged further into several subgroups. In the
future, more information obtained from other NPVs such as
hymenopteran NPVs, dipteran NPVs and decapod NPVs (shrimp)
will help in understanding the complete evolutionary pathway
of baculoviruses. The results of this study also suggest
that the phylogenetic tree of the polh gene can be used to
represent the baculovirus species tree. The comparison of
the polh tree with the trees of five other genes and the universal tree
shows no significant differences and suggests that the polh
gene is a reliable gene for evolutionary studies of
baculoviruses.
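
One simple way to make the idea of congruence between gene trees concrete is to compare the sets of clades that two rooted trees contain. The sketch below is an illustration only and is not the method used in this study. Trees are written as nested tuples, the function names are hypothetical, and both example topologies are invented for demonstration (they only borrow virus names from the text).

```python
def clades(tree):
    """Return (leaves, clade_set) for a rooted tree given as nested tuples."""
    if isinstance(tree, str):
        # A leaf contributes its own name but no internal clade.
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)  # the clade defined by this internal node
    return leaves, found


def shared_clade_fraction(tree_a, tree_b):
    """Fraction of all clades (union of both trees) present in both trees."""
    _, clades_a = clades(tree_a)
    _, clades_b = clades(tree_b)
    return len(clades_a & clades_b) / len(clades_a | clades_b)


# Invented topologies, NOT results from this study.
tree_gene_1 = ((("AcMNPV", "BmMNPV"), ("OpMNPV", "CfMNPV")), "SfMNPV")
tree_gene_2 = ((("AcMNPV", "BmMNPV"), "OpMNPV"), ("CfMNPV", "SfMNPV"))

print(shared_clade_fraction(tree_gene_1, tree_gene_1))  # identical trees
print(shared_clade_fraction(tree_gene_1, tree_gene_2))  # partly congruent
```

Identical topologies give a fraction of 1.0, and the fraction falls as the two trees group the taxa differently.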


CHAPTER  5 
SUMMARY 


In this study, a conserved baculovirus gene, gp41, was
used as a model to study the phylogenetic relationships among
baculoviruses. The transcription of the gp41 gene, the
secondary structure of its protein, and the structure
of the surrounding genomic region were also
studied.

Two complete gp41 gene nucleotide sequences, from
Spodoptera frugiperda multiple nucleocapsid
nucleopolyhedrovirus (SfMNPV-2) and Anticarsia gemmatalis
MNPV (AgMNPV-2D), and a partial gp41 gene from Lymantria
dispar MNPV (LdMNPV) were sequenced. The SfMNPV-2 gp41
gene contained 999 nucleotides and encoded 332 amino acids. Two
SfMNPV-2 gp41 gene transcripts were detected 12 hours
post-infection. Primer extension analysis demonstrated that the
gp41 gene promoter region contained three transcriptional
start sites. Two of them were at the first two nucleotides
of a consensus transcriptional start site (TAAG) of
baculovirus late genes, and the third transcriptional start
site was located in a region where no consensus motif has
been determined (140 nucleotides upstream of the translation start
codon, ATG).
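
The relationship between the reported gene length and protein length can be checked with simple arithmetic. The sketch below assumes, as the figures above imply, that the nucleotide count includes the stop codon. The function name is hypothetical.

```python
def encoded_residues(orf_length_nt, includes_stop=True):
    """Number of amino acids encoded by an open reading frame.

    Assumption: the length is a whole number of codons and, by default,
    the last codon is a stop codon that encodes no residue.
    """
    if orf_length_nt <= 0 or orf_length_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_length_nt // 3
    return codons - 1 if includes_stop else codons


# 999 nucleotides = 333 codons = 332 residues plus one stop codon.
print(encoded_residues(999))
```

The same calculation applies to any complete open reading frame whose length is reported with the stop codon included.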

The AgMNPV-2D gp41 gene contained 1,005 nucleotides and
encoded 334 amino acids. Comparison of the nucleotide and
amino acid sequences of AgMNPV-2D gp41 with those of four other NPVs,
Autographa californica MNPV (AcMNPV), Bombyx mori
MNPV (BmMNPV), SfMNPV and Helicoverpa zea single
nucleocapsid nucleopolyhedrovirus (HzSNPV), showed a minimum
of 59% nucleotide identity and 70% amino acid similarity.
Analysis of the protein secondary structure and amino acid
sequence alignment of AgMNPV-2D gp41 revealed several
conserved domains, including eight α-helix domains, four
loop domains, one β-sheet domain and one transmembrane
domain. Furthermore, the hydrophobicity analysis of
gp41 showed five conserved domains. Domain III
corresponded to one of the conserved α-helix domains
(qQLaNnYxTLLLkr), and domain V corresponded to the
transmembrane domain (EnxxxxAPLSAxxxIFxxx). The genomic
region surrounding AgMNPV-2D gp41 also contained the vlf-1
gene, ORF 330, ORF 300 and ORF >667, in an
arrangement similar to that in AcMNPV, BmMNPV and SfMNPV. On the other
hand, this region had a different genomic location and
transcriptional orientation in HzSNPV. Among these ORFs,
AgMNPV-2D shared 50 to 70% nucleotide identity and 60 to
90% amino acid similarity with the four other NPVs.
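
Percent identity figures of the kind quoted above are obtained by counting matching positions in an alignment. The sketch below shows the calculation on two short, ungapped gp41 fragments taken from the translations in Appendix A (SfMNPV-2) and Appendix C (AgMNPV-2D), covering the conserved α-helix motif. It computes identity, not similarity, and the value for this fragment is not the gene-wide figure. The function name and the choice of fragment are assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned positions where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


# Fragments copied from the translated sequences in the appendices.
sf_fragment = "PIPLPFDQQLANDYLTLLLQRA"  # SfMNPV-2 gp41
ag_fragment = "PIPLPFDQQLANNYVTLLLKRA"  # AgMNPV-2D gp41

print(f"{percent_identity(sf_fragment, ag_fragment):.1f}% identity")
```

Here 19 of the 22 positions match. A similarity score would additionally count conservative substitutions and therefore needs a substitution matrix, which is outside the scope of this sketch.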

Lastly, six baculovirus genes, polyhedrin
(polh), p10, gp41, gp64, DNA polymerase (dnapol) and
ecdysteroid UDP-glucosyltransferase (egt), were used to
construct phylogenetic trees. The phylogenetic trees
confirmed that the hymenopteran NPVs diverged first from
the lepidopteran granuloviruses (GVs) and lepidopteran NPVs.
Later, the lepidopteran GVs diverged from the lepidopteran NPVs.
The results also showed that AcMNPV was closely related to
BmMNPV, and that Orgyia pseudotsugata MNPV (OpMNPV) was
closely related to Perina nuda MNPV (PnMNPV) and
Choristoneura fumiferana MNPV (CfMNPV). The phylogenetic
analysis of dnapol showed that the baculoviruses had
evolutionary paths independent of those of two other
insect DNA viruses, Spodoptera ascovirus (SAV) and
Choristoneura fumiferana entomopoxvirus (CbEPV). In
conclusion, this is the first time that phylogenetic trees
from six different baculovirus genes have been constructed to
study the evolutionary paths of baculoviruses.

In the future, additional molecular data (nucleotide
sequences, amino acid sequences, and three-dimensional protein
structures) for baculoviruses will become available.
These basic data can be used to construct phylogenetic trees
of different baculovirus genes, and to predict the
biological function of a particular baculovirus gene using
computer modeling systems.


APPENDIX  A 

NUCLEOTIDE SEQUENCE OF Spodoptera frugiperda MNPV EcoRI-S
FRAGMENT AND TRANSLATED AMINO ACID SEQUENCE OF GP41 GENE


   1  GAATTCAATG  TCGCTTTTAA  AAATTGCGAA  AGCATTCTGT  GTAAACGTAG
  51  AAGCGTGCAA  ACCGCCTATA  TCACCATGGC  GGTTATAGTT  TTATTTATCA
 101  ACATTATACA  TTTTTCATGG  TATTTTGTAC  TATTTATTTT  CGTAATGATA
 151  TTCTTGCTAT  ATCTAAACAA  TAATTATATG  ATAAGTAATC  CAAAAATTGT
 201  TTACTGCCCT  CATAAAAAAC  ACAATGGCCA  ATTACACGAG  GCCAAATTCA
      MANYTRPNS
 251  ATAACTAAAT  CGTCGACAAT  GTCATCGTCT  TCGTTGTCGT  CGTCCTCGTC
      ITKSSTMSSSSLSSSSS
 301  CGCGGCCATA  ACCGAACCGT  GGATGGACAA  ATGTGTCGAT  TACGTCAATA
      AAITEPWMDKCVDYVNK
 351  AAATCGTTCG  ATACTATAGA  ACAAACGACA  TGTCTCAATT  GACCCCACAA
      IVRYYRTNDMSQLTPQ
 401  ATGCTAAACC  TCATCAACAC  CATACGGAAT  GTTTGCATCG  AAACGTATCC
      MLNLINTIRNVCIETYP
 451  CGTAGACGTC  AACGCCACCA  AGCGTTTCGA  CAGCGACGTC  AACCTTATGA
      VDVNATKRFDSDVNLMN
 501  ACAATTACAA  ACGACTGCAA  AAAGAGCTGG  GCAATAAACC  GATCACGAGC
      NYKRLQKELGNKPITS
 551  GACATTTTCA  AAGCTTCGTT  CGTGTACAGC  GTTTTGCCGT  CGTACGCTCA
      DIFKASFVYSVLPSYAQ
 601  AAAATTTTAC  AACAAGGGCG  GCGATCATCT  AGCCAGCGGC  AGCGTCGAAG
      KFYNKGGDHLASGSVEE
 651  AAGCGGCCCG  TCATTTGGGC  TACGCTTTAC  AATATCAAAT  CGCGCAAGCT
      AARHLGYALQYQIAQA
 701  GTGACCACAA  ACACACCCAT  CCCCCTGCCG  TTCGATCAAC  AGCTTGCCAA
      VTTNTPIPLPFDQQLAN
 751  CGATTATCTA  ACGTTGCTTC  TGCAGCGAGC  CAACATTCCG  ACAAACATAC
      DYLTLLLQRANIPTNIQ
 801  AGGAGATCAT  CAACAGCGGC  AATCGGACGC  ACGGCAACTC  GCGCGTTCAC
      EIINSGNRTHGNSRVH
 851  ATGATCAACG  CTCTCATCAA  CAACGTGATC  GACGATCTGT  TTGCCGGCGG
      MINALINNVIDDLFAGG
 901  CAGTGACTAT  TATCTGTACG  TGCTCAACGA  AACTAACAAA  TCTCGCATTC
      SDYYLYVLNETNKSRIL
 951  TAAGTTTGAA  AGAAAATATC  AGTTACATGG  CACCATTGTC  CGCCACCACT
      SLKENISYMAPLSATT
1001  AACATATTCA  ACTTTATCGC  AACGCTCGCC  ACCAATTCGG  GTAAAAAGCC
      NIFNFIATLATNSGKKP
1051  GAGCGTGTTC  CAGAGCGCTT  CGATGTTGAC  CATGCCTCTA  ACTAAACCTG
      SVFQSASMLTMPLTKPV
1101  TCGTCAGCGA  ATCCAAAAAC  GTGTGCCAAC  AGCAACTGAC  TGAACTGGCG
      VSESKNVCQQQLTELA
1151  TTTGAAAATG  AAGCATTAAG  AAGATTTATC  TTGCAACAGT  TAAGTTATAA
      FENEALRRFILQQLSYK
1201  AAACGACATT  TCGCAACTGT  GATAACAACT  GAGGCTAGAA  AAAAAAAGAT
      NDISQL*
1251  GAGTCTTGAC  GTTCCGTACG  AACGTTTAGG  CACAGCGACC  AAAGTCGATT
1301  ATATTCCGCT  AAAATTAGCT  TTGACTGATT  TACCTTCAGA  AAACACTTCA
1351  GACAACAATG  ACGACAATCA  AAAAAACAAC  AATACCCAAA  ATCCCAAAAT
1401  TGATATTAAT  CAATCAAACG  CCAACAATTA  TAATCAACAT  CAATCGGTTC
1451  GTTCAAAACA  ACAGTTTTAC  GACATTTTAG  TTTTAGGTAT  GCTGACAGTG
1501  TTTTGTATTT  TGGTATTGCT  GTATGCTATA  TATTACTTTG  TTATATTAAG
1551  AGACAGACAA  AAATCCAACA  CTATAAGACC  TAGTTATATG  TTTTAGCATG
1601  ACTGATAACA  TTTTCAATAA  AACAAACAAT  GTGAGAAATG  AATATTCGTT
1651  TAATTGTTGG  AAATCCAAAA  TCCAAAGTCA  TTTTAGATTC  GAGACCGTGT
1701  TTCAACTGGC  CACCGATCGA  CAGCGATGCA  CGCCCGACAA  GGTTCGTAAC
1751  GGTCGGTGGT  CCAAGTTTAT  TTTTAACAAA  CCGTTTGCGC  CCACCACATT
1801  GAAAAGTTAC  AAGTCTAGAT  TCATCAAAAT  CATCTACTGT  CTAATCGACG
1851  AGTCTCATCT  CGACGAACTA  AACACCTACG  ATCTTAATCA  AGAATTC


*  TAAG  is  the  transcription  start  site  of  baculovirus  late 
genes 

*  ATG  is  the  protein  translation  start  site 

*  AATAA is the poly(A) tail signal site
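
The translated lines in this appendix can be reproduced with the standard genetic code. The sketch below is an illustration, not the software used in this study. It translates the first nine codons of the gp41 open reading frame (beginning at the ATG in the row labelled 201) and the last seven codons (ending at the TGA stop in the row labelled 1201). The names CODON_TABLE and translate are hypothetical.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1; '*' marks a stop codon."""
    dna = "".join(dna.upper().split())
    residues = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for start in range(0, len(dna) - 2, 3):
        residues.append(CODON_TABLE[dna[start:start + 3]])
    return "".join(residues)


orf_start = "ATGGCCAATTACACGAGGCCAAATTCA"  # from the row labelled 201
orf_end = "AACGACATTTCGCAACTGTGA"          # from the row labelled 1201

print(translate(orf_start))  # MANYTRPNS
print(translate(orf_end))    # NDISQL*
```

Both results agree with the amino acid lines printed beneath the corresponding nucleotide rows.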


APPENDIX  B 

INTERNET  SERVERS  USED  FOR  DATABASE  SEARCH  AND  PROTEIN 
SECONDARY  STRUCTURE  PREDICTION 


PROGRAM     URL Site (Server Institute)


Databank search

BLAST       http://www3.ncbi.nlm.nih.gov/Blast/
            (National Center for Biotechnology Information, USA)

ENTREZ      http://www3.ncbi.nlm.nih.gov/Entrez/
            (National Center for Biotechnology Information, USA)

Protein secondary structure

Darwin      http://cbrg.inf.ethz.ch/subsection3_1_7.html
            (Swiss Federal Institute of Technology Zurich)

PHD         http://www.embl-heidelberg.de/predictprotein/predictprotein.html
            (European Molecular Biology Laboratory, Germany)

O-linked glycosylation site prediction

NetOglyc    http://genome.cbs.dtu.dk/netOglyc/cbsnetOglyc.html
            (The Technical University, Denmark)

Transmembrane domain analysis

MEMSAT      http://globin.bio.warwick.ac.uk/~jones/memsat.html
            (University of Warwick, U.K.)


APPENDIX C

NUCLEOTIDE SEQUENCE OF Anticarsia gemmatalis MNPV PstI-HindIII
FRAGMENT AND TRANSLATED AMINO ACID SEQUENCE OF GP41 GENE


1     AAGCTTGCCG  ATACGCCGGT  GCCGCACCAT  GGGCCCGCCG  AAGAAGAAAA 
51     CGTTGTCGCC  GAACGCTTTG  GGTCAGAATC  TGGCGACGCA  CCATCGTCCC 
101     CCAAAAAGCA  AAAATTGGAC  GAGTCTGAAC  AAGATTAAAT  ACGACAGCGA 
151     ACTGCTCATT  CACTATCTAT  ACGAAGGGTT  TTGCACCGAC  AAACAACAAT 
201     GCAATTTGAA  CGTGATAAAA  ATTTACAAAG  TAAAAGTAAA  GAAAACGGGC
251     GCTTCCATTT  TGGCACATTA  TTTTGCGCAA  ATTTCTACTT  CAAGCGGTTA
301     CGAGTTTGAA  TTCCACCCCG  GCAGTCAGCC  TCGCACCTTT  CAAACGGTAC
351     ACACCGACGG  TCTCATTATA  AAGGTGCACA  TTATGTGCGA  TGAATGCTGC
401     AAAGCGGAAT  TGCGCAGATA  CATCAAAGGA  GAAAACGGCT  TCAACGTAGC
451     GTTTCGCAAT  TGCGAAAGTA  TCCTGTGTCA  ACGTGTCAGT  TTTCAAACGC
501     TTTTGCTGGG  ATGCGCCATT  CTGTTGCTGC  TGTTTAACGT  GGAAAAATTT
551     TCGATATTAA  ATTTGCTTGT  CATTTTGTTA  CTTTTAGTAG  CGTTGTTTTG
601     TAACAACAAT  TATATTATAA  GTAATCCATA  CGTTGTATTT  TGCAATCATA
651     AGAACGCATT  AAAAAACCAT  GAATGAACGG  GACGGCTTTT  ATTTGAACGT

M       NER       DGFY  LNV 
701     TTCGCAGGCG  CCTGCGAGAC  ACCCGTTTGC  ACCCACCAGC  GCGACCGTTA 
SQA       PARH       PFA       PTS  ATVT 
751     CTAGTTCGCA  AAGCGGTAAT  TATCCAACCA  CAATGTCCAC  AATGGTGCAG 

SSQ       SGN       YPTT       MST  MVQ 
801     CGGACAGATC  GCGGCAGCGC  AAACTCGCTT  GTTAAAACCA  AAGAAGACGC 

RTDR       GSA       NSL       VKTK  EDA 
851     CAGCGGCGAA  TCTATTTGGT  ACAACAAGTG  CACAGACTAT  GTACATAAAA 
SGE       SIWY       NKC       TDY  VHKI 
901     TTATTCGCTA  TTATCGCTGT  AACGACATGT  CTGAATTGAC  TCCTTTAATG 

IRY       YRC       NDMS       ELT  PLM 
951     ATTCATTTTA  TCAACACAAT  ACGCGACATG  TGCATTGACA  GCAACCCTGT 
IHFI       NTI       RDM       CIDS  NPV 
1001     TAGTGTAAAC  ATAATCAAGC  GCGTGCAAAC  TGACGAAGAA  ATTGTTCGCC 
SVN       IIKR       VQT       DEE  IVRH 
1051     ACCTAATTGG  GTTGCAAAAA  GAACTGCGTC  AGAATAGCGT  GGCAGAGTCC 

LIG       LQK       ELRQ       NSV  AES 
1101     ATCGATTCGG  ATTCCAACAT  TTTTCAGCCT  TCGTTTGTAC  TCAATTCGCT 

IDSD       SNI       FQP       SFVL  NSL 
1151     GCCGGCGTAC  GCGCAAAAAT  TTTACAACGG  CGGCGCAGAC  ACGCTTGGCA 
PAY       AQKF       YNG       GAD  TLGK 
1201     AAGACGCGCT  CAACGAGGCG  GCCAAACAGC  TTAGTTTGGC  CGTGCAGTAC 

DAL       NEA       AKQL       SLA  VQY 
1251     ATGGTGTCGG  AAGCGGTCAC  GTGCAGTATT  CCCATCCCGT  TGCCGTTTGA 
MVSE       AVT       CSI       PIPL  PFD
1301     CCAGCAGCTT  GCCAACAATT  ATGTGACACT  ACTTTTAAAA  CGCGCCACGC

QQL       ANNY       VTL       LLK  RATL
1351     TACCTGACAA  CGTGCAAGAA  GCCGTCAAGT  CGCGCAGCTT  TGTGCACATT
PDN       VQE       AVKS       RSF  VHI

1401     AACATGATCA  ATGACCTCAT  AAATTCAGTG  ATTGACGATT  TGTTTGCTGG

NMIN       DLI       NSV       IDDL  FAG

1451     CGGCGGCAAC  TATTATTATT  ACGTGCTCAA  CGAAAAGAAT  CGCGCGCGCG
GGN       YYYY       VLN       EKN  RARV

1501     TCGTAGGGCT  CAAGGAAAAC  GTGGGATTTT  TGGCACCATT  GTCCGCGTCC

VGL       KEN       VGFL       APL  SAS 

1551  GCGGACATTT  TTAATTACAT  GTCGCAACTT  GCTACGCGAC  ACGGCAAACG 

ADIF       NYM       SQL       ATRH  GKR 

1601  TCCCGACATG  TTTGAGAACG  CGGCGTTTCT  TACGTCGGCC  GCCAACGCCA 
PDM       FENA       AFL       TSA       A     N     A  I 

1651  TCAACTCGCC  GGCCGCCATT  TGACGCAGAG  CGCGTGCCAA  AAGAGTTTGT 

N     S     P       A     A     I  * 

1701  CTCAATTAGC  GGCGCAGTGT  GAAACGCTAA  CCCGGTTCAT  ATTCATGATT 

1751  GTCAAGCAAA  CTGACGCCGA  CAGATTACTG  AATCCGCCGC  GTTCGCGCGC 

1801  AATATGAGTT  TGTACAAGAA  CAAAGTGTGG  TGCGTGTACA  TTGTGCGGCG 

1851  AGACGACGGC  AAACTGTACA  CGGGCATCAC  CAGCGATTTG  CGACGTCGTT 

1901  TGAACCAACA  CAAACGCGGC  GTTGGTGCGC  GTTTTTTACG  CAATGCAAAC 

1951  TCTTTGCGTT  TACTTTATTG  CAGCGCAAAC  GCGTACGATT  ACAAGACCGC 

2001  CGCGCAATTG  GAATACAATC  TTAAGCGTAA  ACGCGGGAAA  TATTTTAAAT

2051  TGCAATTAAT  TAAAGCGCAA  CCTCAACATT  TGCATCAATA  TTTGTCATCA 

2101  TGAACTTGGA  CGTGCCCTAC  TACCGTTTGG  GCAACCACGA  GCGCGTAGAA 

2151  TACATTCCGC  TAAAACTAGC  GCTAAACGAC  GATGCGCCCG  TCAACAACAA 

2201  CAACGACGAC  ACTGCTGTGT  ACGAATACTC  GGACGTACAC  AAAGGCGAAA

2251  CGCGCACGGG  TCAAATGTCG  GCCGGTTTAA  TTGTGCTGAT  TAGTCTGGTG 

2301  GCGTTTGTGG  CTTTGTTTCT  GCTATTGTAT  GTTATCTATT  ATTTTGTAAT

2351  ATTAAGAGAA  GAGCCGCAAT  ATTCTTCCGA  CACAATTGAC  AACAGCGATC 

2401  CTTCTTTTTT  GTTTAATAAA  TTTGATTAAT  TACAATGAAC  GAGCTGTTGA 

2451  ACGCACGCAA  CGAAAATGTT  TTTAACGATT  GGAAAATGCG  CATTCAATCA 

2501  GCGCCACAAT  TTGAGCACGT  GTTTGATTTG  GCCACCGACC  GACAGCGGTG 

2551  CACGCCGGAC  GAAGTAAAAA  ACGACAGCCT  GTGGAGCAAG  TACATGTTTC 

2601  CCAAACCGTT  TGCGCCCACC  ACACTAAAAA  GTTACAAGTC  ACGTTTTATT

2651  AAAATTATTT  TTAGCCTAAT  AGAGGAACCG  GATTTGCAAA  ACACCGCATA 

2701  TTCGTTAAAC  AGGGAATTTG  ATTCGATTGA  ATATCAACGG  TTGCTTGTGA
2751  ACCCCAAAGA  ACTGTGCAAA  CGCATGCTTG  AATTGAGGTC  TGTGACCAAG
2801  GAAACGTTGC  AGCTTACCAT  TAACTTTTAC  ACAAACGCTA  TGGGTTTGGC
2851  CGAATTTAAA  ATCCCGCGCA  TGGTCATGTT  GCCACGTGAC  AAGGAACTTA
2901  AAACCATTCG  AGAAAAAGAA  AAAAATTTTA  TGCTCAAAAA  CGCAATAGAC
2951  ACAATTTTGG  ATTTTATTAA  TTCCAAAATA  AAAATGTTAA  ACGGCGATTA
3001  CGTGCACGAC  CGCGGCCTTA  TTCGAGGCGC  CATAGTGTTT  TGCATAATGT
3051  TAGGTACGGG  CATGCGCATC  AACGAGGCGC  GCCAGCTCAG  CGTGGAAGAT
3101  CTTAACGTGC  TAATTAAAAA  AGGCAAACTG  CGCAGCAACA  CTATCAATTT
3151  AAAACGCAAA  CGCAGCCGCA  ACAACACGCT  CAACACGATC  AAAACCAAAC
3201  CGTTGGAGCT  GGCTCGTGAA  ATTTACGCGC  GCAACCCCAC  CGTGTTGCAA
3251  ATCTCTAAAA  ACACTTCCAC  GCCTTTTAAA  GACTTTCGCC  GTTTGCTGGA
3301  CGAAGCGGGC  GTAGAAATGG  AACGGCCACG  TAGCAACATG  ATAAGACACT
3351  ATTTGAGCAG  CAATTTGTAC  AACAGCGGCG  TGCCGTTGCA  AAAGGTGGCG
3401  CGCTTGATGA  ACCACGAATC  GCCGGCCAGT  ACCAAACCGT  ATTTGAACAA
3451  GTACAATTTT  GACGAAAGCA  GCAGCAGCAC  GAGGAATCAG  AGTTGAACAA
3501  CCGCGACTCG  TCTGCAG


*  TAAG  is  the  transcription  start  site  of  baculovirus  late 
genes 

*  ATG  is  the  protein  translation  start  site 

*  AATAA is the poly(A) tail signal site
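
The motifs named in the notes above can be located in the listing with a plain substring search. The sketch below is an illustration only. It scans the two rows labelled 601 and 651 of this appendix (positions 601 to 700), which contain the gp41 translation start. The function name find_motif and its first_position parameter are hypothetical, and finding a TAAG motif by text search does not by itself show that transcription starts there.

```python
def find_motif(sequence, motif, first_position=1):
    """Return the 1-based positions of every occurrence of motif.

    first_position is the listing coordinate of the first base supplied,
    so results can be read directly against the row labels.
    """
    seq = "".join(sequence.upper().split())  # drop block spacing
    motif = motif.upper()
    positions = []
    index = seq.find(motif)
    while index != -1:
        positions.append(first_position + index)
        index = seq.find(motif, index + 1)  # allow overlapping matches
    return positions


# Rows 601 and 651 of the AgMNPV-2D listing.
region = (
    "TAACAACAAT TATATTATAA GTAATCCATA CGTTGTATTT TGCAATCATA"
    "AGAACGCATT AAAAAACCAT GAATGAACGG GACGGCTTTT ATTTGAACGT"
)

print(find_motif(region, "TAAG", first_position=601))  # [618, 649]
print(find_motif(region, "ATG", first_position=601))   # [669, 673]
```

The ATG at position 669 is the one followed by the translated residues in the listing, and both TAAG motifs in this window lie upstream of it.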


APPENDIX  D 

PURIFICATION OF POLYHEDRA, ALKALINE-RELEASED VIRUSES AND DNA
FROM  Lymantria  dispar  MNPV  COMMERCIAL  FORMULATION  (MODIFIED 
FROM  THE  LABORATORY  PROTOCOL  OF  DR.  MARUNIAK) 


Purification  of  Polyhedra  from  LdMNPV  Commercial  Formulation 

1.  Dissolve  2  g  LdMNPV  in  10  ml  homogenization  buffer  (1% 
ascorbic  acid,   2%  SDS,    10  mM  Tris-HCl  and  1  mM  EDTA,  pH 
8.0). 

2.  Filter  the  polyhedra  solution  through  4  layers  of 
cheesecloth. 

3.  Centrifuge the solution at 10,000 rpm for 10 min at 4°C
(Beckman J21-C centrifuge and JA20 rotor).

4.  Discard supernatant and resuspend pellet in 9 ml of
distilled water with 1 ml 5 M NaCl.

5.  Centrifuge  the  solution  at  10,000  rpm  for  15  min  at  4°C. 

6.  Resuspend  in  5  ml  distilled  water. 

7.  Make  a  30  ml  sucrose  gradient  from  63%  to  40%  in  TE 
buffer   (10  mM  Tris  and  1  mM  EDTA,  pH  8.0),   using  a 
gradient  former  (MBA,   Clearwater,  FL)   and  Masterflex  pump 
(Cole-Parmer  Instrument  Co.). 

8.  Layer the resuspended viral solution on top of the
sucrose gradient and centrifuge in an ultracentrifuge at 24,000 rpm for
30 min at 4°C (DuPont OTD 65B ultracentrifuge and AH627
swinging bucket rotor).

9.  Transfer  the  polyhedra  band  to  a  new  tube,   and  mix  with 
distilled  water. 

10.  Centrifuge  the  solution  at  10,000  rpm  for  15  min  at  4°C. 

11.  Resuspend the pellet in 0.5 ml distilled water.


Purification of the Alkaline-Released Virus from LdMNPV Polyhedra


1.  Add one third volume of DAS (final concentration: 0.1 M
Na2CO3, 0.01 M EDTA, 0.17 M NaCl, pH 10.9) to the
polyhedra solution and mix. Keep the polyhedra solution
on ice at all times. If the polyhedra do not dissolve,
add a few drops (100 µl) of 0.5 M NaOH to the polyhedra
solution and vortex.

2.  Prepare  a  sucrose  gradient  from  40%  to  56%. 

3.  Centrifuge  at  24,000  rpm  for  1  hr  at  4°C. 

4.  Transfer  the  different  bands    (alkaline  released  virus 
with  different  numbers  of  nucleocapsids)   to  a  new  tube. 

5.  Add  TE  to  fill  the  tube  and  mix  well.     Centrifuge  at 
24,000  rpm  for  30  min. 

6.  Discard supernatant.

7.  Resuspend the virus (alkaline-released virus) in 500 µl TE
buffer.
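
The concentrations in the two protocols above follow from ordinary dilution arithmetic. The sketch below is an illustration only and its function names are hypothetical. It assumes that volumes are additive, that the sample itself contributes none of the solute, and that "one third volume" means one third of the volume of the polyhedra solution.

```python
def final_concentration(stock_conc, stock_volume, diluent_volume):
    """Concentration after mixing a stock with a solute-free diluent."""
    if stock_volume <= 0 or diluent_volume < 0:
        raise ValueError("volumes must be positive")
    return stock_conc * stock_volume / (stock_volume + diluent_volume)


def stock_needed(final_conc, added_fraction):
    """Stock concentration that gives final_conc when added_fraction
    volumes of stock are added to one volume of sample."""
    if added_fraction <= 0:
        raise ValueError("added_fraction must be positive")
    return final_conc * (1.0 + added_fraction) / added_fraction


# 1 ml of 5 M NaCl plus 9 ml of distilled water gives about 0.5 M NaCl.
print(final_concentration(5.0, 1.0, 9.0))

# For 0.1 M Na2CO3 after adding one third volume, the DAS stock would
# need to be about 0.4 M, that is a fourfold concentrate.
print(stock_needed(0.1, 1.0 / 3.0))
```

Under the same assumptions, every DAS component would be prepared at four times its listed final concentration.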


Viral  DNA  Purification 


1.  Add 40 µl of 20% SDS to the alkaline-released virus
solution.

2.  Incubate 10 min at room temperature.

3.  Add 10-25 µl proteinase K (5 mg/ml) and incubate
overnight at 37°C.




4.  Extract DNA with 0.75 ml distilled phenol (saturated with
TE). Invert tubes gently. Spin in a microcentrifuge for
about 1 min. Transfer the upper aqueous phase to a clean
microcentrifuge tube. Extract the aqueous phase twice
more with phenol.

5.  Extract the aqueous phase three times with 0.75 ml
water-saturated ether.

6.  Heat the DNA solution at 56°C for 15 min in a heat block
with caps open to evaporate the ether.

7.  Dialyze the DNA solution 4 times against 1 L TE (2 times
daily for 2 days).

8.  Measure the DNA concentration by reading the optical density
(OD) at 260 nm. The DNA concentration (µg/ml) equals
OD260 × 50.
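
The conversion in step 8 can be written as a one-line function. The sketch below is an illustration only. The function name and the dilution_factor parameter are assumptions added here (the protocol does not mention diluting the sample before reading), and the factor of 50 is the one given in step 8 for double-stranded DNA.

```python
def dna_concentration_ug_per_ml(od260, dilution_factor=1.0):
    """DNA concentration in µg/ml from an OD260 reading (OD260 x 50).

    dilution_factor is hypothetical: use it only if the sample was
    diluted before the reading (10.0 for a 1:10 dilution).
    """
    if od260 < 0 or dilution_factor <= 0:
        raise ValueError("od260 must be >= 0 and dilution_factor > 0")
    return od260 * 50.0 * dilution_factor


# An undiluted sample reading OD260 = 0.2 corresponds to about 10 µg/ml.
print(dna_concentration_ug_per_ml(0.2))
```

For a sample read after a 1:10 dilution, passing dilution_factor=10.0 scales the result back to the concentration of the undiluted stock.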


APPENDIX  E 

PARTIAL  NUCLEOTIDE  AND  TRANSLATED  AMINO  ACID  SEQUENCES  OF 
Lymantria dispar MNPV GP41 GENE


1  TATGATAAGTAGTCCTCGGGTGGAGTTTTGCGAGCATCATGCAGTCCGAG

MQSE

51  CCCGCTGACCGCGACGCGGCGGCCGCCGTCTACAGCGCCGCCTGGATGAA
DAAAAVYSAAWMNQCVD
101  ACCGGGTCATCAAGTACTATCGCACCAACGACATGTCCCACTTGACGCCC
YVDPADRRVIKYYRTND
151  CAGATGCAATCCAGTGCGTGGACTACGTGGTGCTGATCAACACCATTCGC
MSHLTPQMQLLINTIR

201  GACCTGTGCCTGGACACCAACCCGGTGGACGTGAACGTGGTGAAGCGCTT

DLCLDTNPVDVNVVKRF
251  CGACAGCGACGAGAACCTGATCAAGCACTACGCGCGCCTCGCCAAGGACA
DSDENLIKHYARLAKDM

301  TGGGCGGCTCGGCGGTGCCCGACAACGTGTTCCAGCCCTCTTTCGTCTAC

GGSAVPDNVFQPSFVY
351  ACCGTCCTGCCGGCCTACGCGCAAAAGTTTTACAACAAGGGT
TVLPAYAQKFYNKG

*  TAAG  is  the  transcription  start  site  of  baculovirus  late 
genes 

*  ATG is the protein translation start site